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Upsampling 


Upsampling 
The operation of upsampling by factor L © N describes the insertion of 


L — 1 zeros between every sample of the input signal. This is denoted by " 
7 (Z)" in block diagrams, as in [link]. 


xm] | th - yln] 


Formally, upsampling can be expressed in the time domain as 
x\>| if eZ 
yln] = { ln] if 
0 otherwise 


In the z-domain, 
Y¥{z)= So ylnlz = » fe |e = So a[k]z% = X(2") 


and substituting z = e™ for the DTFT, 
Equation: 


Y (e”) = X(e”) 


As shown in [link], upsampling compresses the DTFT by a factor of L 
along with the w axis. 


Downsampling 
(Blank Abstract) 


The operation of downsampling by factor / € N describes the process of 


keeping every M*” sample and discarding the rest. This is denoted by " 
{ (M)" in block diagrams, as in [link]. 


x{m|] {M |— yjn] 


Formally, downsampling can be written as 
y|n] = 2|nM] 


In the z domain, 
Equation: 


Y(z) = 


mlm] (fp DM! ett) 237 


a 1 if mis a multiple of M 
where 5- yer eiwpm — ‘ 
P 0 otherwise 


Equation: 
¥(2) = UME elm] (ee) zt)” 
= 4 DI X(e i) zi) 


Translating to the frequency domain, 
Equation: 


As shown in [link], downsampling expands each 27 -periodic repetition of 
xX (e”) by a factor of M along the w axis, and reduces the gain by a factor 
of M. If x|m] is not bandlimited to 5, aliasing may result from spectral 
overlap. 


Note: When performing a frequency-domain analysis of systems with 
up/downsamplers, it is strongly recommended to carry out the analysis in 
the z-domain until the last step, as done above. Working directly in the e’”- 
domain can easily lead to errors. 


x{m] X(e”) 
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Interpolation 


Interpolation 


Interpolation is the process of upsampling and filtering a signal to increase 
its effective sampling rate. To be more specific, say that z[m] is an 
(unaliased) T-sampled version of x,(t) and v[n] is an L-upsampled version 
version of x|m]. If we filter v[m] with an ideal +-bandwidth lowpass filter 
(with DC gain L) to obtain y[n], then y[n] will be a +-sampled version of 


x-(t). This process is illustrated in [link]. 


We justify our claims about interpolation using frequency-domain 
arguments. From the sampling theorem, we know that T- sampling x,(t) to 
create z[n] yields 

Equation: 


53 1 Ww — Ik 
X(e ) = FLX) 


After upsampling by factor Z, [link] implies 


won FE) HE 


Lowpass filtering with cutoff + and gain L yields 


ESE) tele) 


Y ta 


since the spectral copies with indices other than k = [LZ (forl € Z) are 
removed. Clearly, this process yields a + -shaped version of x(t). [link] 
illustrates these frequency-domain arguments for L = 2. 
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Application of Interpolation - Oversampling in CD Players 


Application of Interpolation- Oversampling in CD Players 


The digital audio signal on a CD is a 44.1 kHz sampled representation of a 
continuous signal with bandwidth 20 kHz. With a standard ZOH-DAC, the 
analog reconstruction filter would have passband edge at 20 kHz and 
stopband edge at 24.1 kHz. (See [link]) With such a narrow transition band, 
this would be a difficult (and expensive) filter to build. 


|\X(e™)| |X2(j22)| |H-(j2)| 
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If digital interpolation is used prior to reconstruction, the effective sampling 
rate can be increased and the reconstruction filter's transition band can be 
made much wider, resulting in a much simpler (and cheaper) analog filter. 
[link] illustrates the case of interpolation by 4. The reconstruction filter has 
passband edge at 20 kHz and stopband edge at 156.4 kHz, resulting in a 
much wider transition band and therefore an easier filter design. 
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Decimation 


Decimation is the process of filtering and downsampling a signal to 
decrease its effective sampling rate, as illustrated in [link]. The filtering is 
employed to prevent aliasing that might otherwise result from 
downsampling. 

DC gain 1 


z{m] —=LPF fj vm “4M — yln] 


To be more specific, say that 
tit t) = x1(t) + ei (t} 


where ,(t) is a lowpass component bandlimited to 577 ur Hz and x(t) isa 
bandpass component with energy between 5 sur a nd = + Hz. If sampling 
z(t) with interval T yields an unaliased discrete representation z{m], then 


decimating x|m] by a factor M will yield y[n], an unaliased MT-sampled 
representation of lowpass component 2/(t). 


We offer the following justification of the previously described decimation 
procedure. From the sampling theorem, we have 


— 2rk 1 w — 27k 
x = 4S 
el re he a 
The bandpass component Xy(7£2) is the removed by +7-lowpass filtering, 


giving 
1 w— 27k 
2 Vigo 
T 2 (i T ) 


Finally, downsampling yields 
Equation: 


¥(e) = bp Spl! D(A | 


_ aT a oo: Xi(i ss) 
_ a Xi (i van | 


which is clearly a MT-sampled version of x;(t). A frequency-domain 
illustration for M = 2 appears in [link]. 
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Resampling with Rational Factor 


Interpolation by LZ and decimation by M can be combined to change the 
effective sampling rate of a signal by the rational factor va . This process is 
called resampling or sample-rate conversion. Rather than cascading an 
anti-imaging filter for interpolation with an anti-aliasing filter for 
decimation, we implement one filter with the minimum of the two cutoffs 
Ee | and the multiplication of the two DC gains (L and 1), as 


illustrated in [link]. 


DC gain L 


tL LPF min{ +, 77} 


Digital Filter Design for Interpolation and Decimation 


First we treat filter design for interpolation. Consider an input signal x|n| 
that is wo-bandlimited in the DTFT domain. If we upsample by factor L to 
get v(m], the desired portion of V (e™”) is the spectrum in i= 7 ) , while 
the undesired portion is the remainder of |—7, 77). Noting from [link] that 
Vie™) has zero energy in the regions 

Equation: 


ene, Cee) bez 
L L 


the anti-imaging filter can be designed with transition bands in these 
regions (rather than passbands or stopbands). For a given number of taps, 
the additional degrees of freedom offered by these transition bands allows 
for better responses in the passbands and stopbands. The resulting filter 
design specifications are shown in the bottom subplot below. 
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Next we treat filter design for decimation. Say that the desired spectral 
component of the input signal is bandlimited to 47 < 7; and we have 


decided to downsample by M. The goal is to minimally distort the input 
spectrum over as ae), i.e., the post-decimation spectrum over 


[—wy, wo). Thus, we must not allow any aliased signals to enter [—wy, wo). 


To allow for extra degrees of freedom in the filter design, we do allow 
aliasing to enter the post-decimation spectrum outside of |—w, wo) within 
|—7, 7). Since the input spectral regions which alias outside of [—wo, wo) 
are given by 

Equation: 


ore TOO. 2(k +1) — wo heEZ 
L L 


(as shown in [link]), we can treat these regions as transition bands in the 
filter design. The resulting filter design specifications are illustrated in the 
middle subplot ([link]). 
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Noble Identities 


The Noble identities (illustrated in [link] and [link]) describe when it is 
possible to reverse the order of upsampling/downsampling and filtering. We 
prove the Noble identities showing the equivalence of each pair of block 
diagrams. 


The Noble identity for interpolation can be depicted as in [link]: 


fn] ef ge Ly ret) Le vm) xf] ef arte) LL Le iy 


For the left side of the diagram, we have 
Y= H(z”)Vi(z) 


where V;(z) = X(z*) 
while for the right side, 
where V2(z) = H(z)X(z) 
Y(z)= H(z”) X(z’) 
Thus we have established the Noble identity for interpolation. 


The Noble identity for decimation can be depicted as in [link]: 


v(m] | - wv.) vein - 
z{n] —+|M on H(z) —~y[m] <> 2[n] —~ A(2™)}—+ | M -~ y|m] 


For the left side of the preceding diagram, we have 


V, _i ae (—1) 57k a7 
2). = TV; S e z 
k=0 


Equation: 


while for the right side, 
Equation: 


where V2(z) = X(z)H(z™) 


Equation: 
Y(z) = 4 gt X (eliza ) (ea 2H ) 
= H(2)4 DMG X(e tem) 


Thus we have established the Noble identity for decimation. Note that the 
impulse response of H (Z* ) is the L-upsampled impulse response of H(z). 


Polyphase Interpolation 


Polyphase Interpolation Filter 


Recall the standard interpolation procedure illustrated in [link]. 


ele] a] th |—+) (2) |—= ol 


Note that this procedure is computationally inefficient because the lowpass 
filter operates on a sequence that is mostly composed of zeros. Through the 
use of the Noble identities, it is possible to rearrange the preceding block 
diagram so that operations on zero-valued samples are avoided. 


In order to apply the Noble identity for interpolation, we must transform 
H(z) into its upsampled polyphase components H,, as \; 


DP =40j2.4406— 1}, 
Equation: 
H(z) = dinn hin i - 
Dik ye  A[kL + pz #4?) 


via k = lF|.p = nmod L 
Equation: 


via hp|k] = h|kL + pl 
Equation: 


L-1 


H(z) = S° H,(z")z” 


p=0 


Above, |-| denotes the floor operator and - mod M the modulo-M 
operator. Note that the p*® polyphase filter h,[k] is constructed by 
downsampling the "master filter" h[n] at offset p. Using the unsampled 
polyphase components, the [link] diagram can be redrawn as in [link]. 


a{n] tL ——~ H(z") (+)—= ylm] 
zt 
F, 
~ lz") eo 
z7 
zo 
~ Hy_,(z") 


Applying the Noble identity for interpolation to [link] yields [link]. The 
ladder of upsamplers and delays on the right below accomplishes a form of 
parallel-to-serial conversion. 
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Polyphase Decimation Filter 
Implementation of polyphase decimation filters. 


Polyphase Decimation 


Recall the standard decimation method in [link]. 


z{n] —- H(z 4M |— yf y(m|] 


Note that this procedure is computationally inefficient because it discards 
the majority of the computed filter outputs. Through the use of the Noble 
identities, it is possible to rearrange [link] so that filter outputs are not 
discarded. 


In order to apply the Noble identity for decimation, we must transform 
H(z) into its upsampled polyphase components H, Cag ), 

p = {0,..., M — 1}, defined previously in the context of polyphase 
interpolation. 

Equation: 


H(2) = Lon bine 
Tae GT ARM + pe)? 


viak = lar |> P = n mod M 
Equation: 


Equation: 


Using these unsampled polyphase components, the preceding block 
diagram can be redrawn as [link]. 


z{n} | Hy(z™) | (+) ~ | M|—« y[m] 
z-! | 
| Ay (z ) | 
zy! 
y-! 
Hy-i(z™) | 


Applying the Noble identity for decimation to [link] yields [link]. The 
ladder of delays and downsamplers on the left below accomplishes a form 
of serial-to-parallel conversion. 


Computational Savings of Polyphase Interpolation/Decimation 


Computational Savings of Polyphase Interpolation/Decimation 


Assume that we design FIR LPF H(z) with N taps, requiring N multiplies 
per output. For standard decimation by factor M/, we have N multiplies per 
intermediate sample and M intermediate samples per output, giving NM 
multiplies per output. 


For polyphase decimation, we have a multiplies per branch and MW 
branches, giving a total of NV multiplies per output. The assumption of a“ 
multiplies per branch follows from the fact that h[n| is downsampled by 
to create each polyphase filter. Thus, we conclude that the standard 
implementation requires / times as many operations as its polyphase 
counterpart. (For decimation, we count multiples per output, rather than per 
input, to avoid confusion, since only every M* input produces an output.) 


From this result, it appears that the number of multiplications required by 
polyphase decimation is independent of the decimation rate Z. However, it 
should be remembered that the length NV of the =; -lowpass FIR filter H(z) 
will typically be proportional to /. This is suggested, e.g., by the Kaiser 
FIR-length approximation formula 


—10 log (6,6,) — 13 
2.3240 (w) 


where A(w) in the transition bandwidth in radians, and 6, and 6, are the 
passband and stopband ripple levels. Recall that, to preserve a fixed signal 
bandwidth, the transition bandwidth A(w) will be linearly proportional to 
the cutoff =, so that NV will be linearly proportional to M. In summary, 
polyphase decimation by factor M requires NV multiplies per output, where 
N is the filter length, and where JN is linearly proportional to M. 


Using similar arguments for polyphase interpolation, we could find 
essentially the same result. Polyphase interpolation by factor L requires N 
multiplies per input, where JV is the filter length, and where JV is linearly 


proportional to the interpolation factor L. (For interpolation we count 
multiplies per input, rather than per output, to avoid confusion, since M 
outputs are generated in parallel.) 


A Group-Delay Interpretation of Polyphase Filters 


Previously, polyphase interpolation and decimation were derived from the 
Noble identities and motivated for reasons of computational efficiency. 
Here we present a different interpretation of the (ideal) polyphase filter. 


Assume that H(z) is an ideal lowpass filter with gain D, cutoff +, and 
constant group delay of d: 


Le“) if we 2,4 
H(e“’) _ L i) 
0 if we |-7,-4) A [4,7) 
Recall that the polyphase filters are defined as 
Vp,p € {0,...,L—1}: (h,[k] = h[kL + pl) 


In other words, h»|k] is an advanced (by p samples) and downsampled (by 
factor L) version of h{n] (see [link]). 


v{[n] 
h[n] . zP “1b ~ hylk] 


The DTFT of the p* polyphase filter impulse response is then 
Equation: 


where V(z) = H(z)z? 
Equation: 


Equation: 


myles) = EEE se Pa(e 
= Wu, |w| <m: (z (e'7?H(e'r))) 


d—p 


= Vuw,|w|) <7: (e-)r') 


Thus, the ideal p‘" polyphase filter has a constant magnitude response of 
one and a constant group delay of re samples. The implication is that if 
the input to the p‘" polyphase filter is the unaliased T-sampled 
representation z[n| = x,(nT), then the output of the filter would be the 


unaliased T-sampled representation yp|n] = 2¢ ( (n = ae ) i) (see 
[link]). 
z-(nT) s 
ze(t) Hy(2) | we((n—“2)r) 


[link] shows the role of polyphase interpolation filters assume zero-delay ( 


d = 0) processing. Essentially, the p™ filter interpolates the waveform +. 
way between consecutive input samples. The LZ polyphase outputs are then 


interleaved to create the output stream. Assuming that x(t) is bandlimited 
to aya Hz, perfect polyphase filtering yields a perfectly interpolated output. 
In practice, we use the casual FIR approximations of the polyphase filters 
h,|[k] (which which correspond to some casual FIR approximation of the 
master filter h[n]). 


Polyphase Resampling 


Polyphase Resampling with a Rational Factor 


Recall that resampling by rational rate a 


following three-stage process(see [link]). 


can be accomplished through the 


DC gain L 


LPF min{ }, 77 


If we implemented the upsampler/LPF pair with a polyphase filterbank, we 
would still waste computations due to eventual downsampling by M. 
Alternatively, if we implemented the LPF/downsampler pair with a 
polyphase filterbank, we would waste computations by feeding it the 
(mostly-zeros) upsampler output. Thus, we need to examine this problem in 
more detail. 


Assume for the moment that we implemented the upsampler/LPF pair with 
a polyphase filterbank, giving the architecture in [link]. 


~|M—+ y[m] 


x[n] 


27! 


faa © 


Keeping the "parallel-to-serial" interpretation of the upsampler/delay ladder 
in mind, the input sequence to the decimator g[J| has the form as in [link] 


.., vo[0], v1 [0], v2 [0], .. . , vz —1[0], 
vo[1], v1 [1], ve[1],...,vz-1[1], 
Vo[2], v1 [2], va[2], ..., vz—1[2],... 


Equation: 


qimM] 


Ymm), [LEI] 


pa hima), Ala [| 2 | = k| 


y|m| 


Thus, to calculate the resampled output at output index m, we should 
calculate only the output of branch number mM mod L at input index 


| 
LE 
wasted. The resulting structure is depicted in [link]. 


| . No other branch outputs are calculated, so that no computations are 


tin] — xm] —s| Hyg (2) |» — yl 


at output index m use: 
branch pp, = (mM)_, 
input index ny», = [eM |. 


ym] = >, Apmlk| 2[%m —k] 
k 


An equally-efficient structure could be obtained if we implemented the 
LPF/downsampler using the M/-branch polyphase decimator which was fed 
with the proper sequence of input samples. However, this structure is not as 
elegant: rather than computing the output of one particular polyphase 
branch per output sample, we would need to add all branch outputs, but 
where each branch output was calculated using a particular subset of 
polyphase taps. 


Computational Savings of Polyphase Resampling 


Computational Savings of Polyphase Resampling 


Recall the standard (non-polyphase) resampler in [link]. 


For simplicity, assume that L > M. Since the length of an FIR filter is 
inversely proportional to the transition bandwidth (recalling Kaiser's 
formula), and the transition bandwidth is directionally proportional to the 
cutoff frequency, we model the lowpass filter length as N = aL, where a 
is a constant that determines the filter's (and thus the resampler's) 
performance (independent of L and /). To compute one output point, we 
require M filter outputs, each requiring V = aL multiplies, giving a total 
of aLM multiplies per output. 


In the polyphase implementation, calculation of one output point requires 
the computation of only one polyphase filter output. With NV = aL master 
filter taps and LZ branches, the polyphase filter length is a, so that only a 
multiplies are required per output. Thus, the polyphase implementation 
saves a factor of LM multiplies over the standard implementation! 


CD to DAT Conversion 


Example: CD to DAT rate conversion 


Digital audio signals stored on compact digital discs (CDs) are sampled at 
44.1 kHz, while those stored on digital audio tapes (DATs) are sampled at 
48 kHz. Conversion from CD to DAT requires an exchange rate of 


_ 48000 _ 160 


~ 44100 +147 


Assuming that the audio signal is bandlimited to 20 kHz, we design our 
master lowpass filter with transition bands 


Vk,keEZ:(k eZ) 


Keeping the passband and stopband ripple levels below —96 dB requires a 
filter with a length N ~ 10000, implying that the standard polyphase 
resampler will require about NM = 1.5 million multiplication per output, 
or 70 billion multiplications per second! If the equivalent polyphase 
implementation is used instead, we require only — ~ 62 multiplies per 
output, or 3 million multiplications per second. 


Polyphase Resampling with Arbitrary (Non-Rational or Time-Varying) Rate 


Though we have derived a computationally efficient polyphase resampler 
for rational factors Q = <, the structure will not be practical to implement 
for large LZ, such as might occur when the desired resampling factor Q is 
not well approximated by a ratio of two small integers. Furthermore, we 
may encounter applications in which Q is chosen on-the-fly, so that the 
number L of polyphase branches cannot be chosen a priori. Fortunately, a 
slight modification of our exisiting structure will allow us to handle both of 


these cases. 


Say that our goal is to produce the 2 -rate samples x- (m#) given the pe 


rate samples z.(nT'), where we assume that ,(t) is bandlimited to 5 and 
Q can be any positive real number. Consider, for a moment, the outputs of 
polyphase filters in an ideal zero-delay L-branch polyphase interpolation 
bank (as in [link]). 


xX £,(nT Le (n+ $)T & 
=) FE e}-O— ate 


L 


) Est 
L irae elt eo | 


We know that, at time index n, the p"” and (p+ a filter outputs equal 


2e((n+2)r) 4 20( (n+ 2£1)r) 


respectively. Because the highest frequency in x,(t) is limited to ae the 


waveform cannot not change abruptly, and therefore cannot change 


significantly over a very small time interval. In fact, when L is large, the 
waveform is nearly linear in the time interval between ¢ = (n = 2)T and 


{= (n + er, so that, for any a € {0, 1), 
pt+a ( ) pt+l 
c T)=2,\1 — )T = 
v( (n+ ZL ) ) a ar ta(n+ L 
Dae (( FY ) pl 
c Tag —)T : pee Dy 2 
x ((n+ L ) ) x a 5 + ax n+ ZL 


This suggests that we can closely approximate x,(t) at any t € R by 
linearly interpolating adjacent-branch outputs of a polyphase filterbank with 
a large enough L. The details are worked out below. 


Assume an ideal L-branch polyphase filterbank with d-delay master filter 
and 7’-sampled input, giving access to x, ( (n + na ) T) forn € Zand 
p € {0,..., £ — 1}. By linearly interpolating branch outputs p and p + 1 at 


: : —d+a 
time n, we are able to closely approximate x, ( (n =F pie \r) for any 


a € |0,1). We would like to approximate y|m] = 2, (mz —d z) in this 


manner - note the inclusion of the master filter delay. So, for a particular m, 
Q, d, and L, we would like to find n € Z, p € {0,..., 2 — 1}, and 

a € [0,1) such that 

Equation: 


— T T 
(n+ poate rama - dT 


Equation: 


nib+p+a = oF 
— ron 
= (|3| ++ Oo mod 1)L 
— [3 |L+ Ey mod L| + Oo mod 1L mod 1 
— |B |L+ Ey mod 1L| + a mod 1 


where Ral eZ, 3 mod LL € {0,...,L—1}, ™ mod 1 € (0,1). 


Thus, we have found suitable n, p, and a. Making clear the dependence on 
output time index m, we write 


m= |, 
~ 
Pm = G mod 1 


L 
ean | 


Q 
and generate output y|/m] ~ x, (ms _ dz) via 


y[m] = 1 2 hy,, {k]£[Mm — k] + am hy,,+1[k] [nm — 


The arbitrary rate polyphase resampling structure is summarized in [Link]. 


+= — y{m] 


z[n]—= z[n, Neate 


Hyati(2)}>| Om | (z) 


at output index m, use 
branch pm = |(G)i L] 
input index n,, = | 3] 
weight ay, = (%5¢)1. 
ylm] = (1-am) 204 Mpa k] 2[%m — A] 
+O Dik pn +h] £ [Mn — Fi] 


Note that our structure refers to polyphase filters H,,,(z) and Hp,,41(z) for 
Pm € {0,..., 2 — 1}. This specifies the standard polyphase bank 

{ Ho(z),..., Hz_1(z)} plus the additional filter Hz,(z). Ideally the p™ 
filter has group delay =. so that Hz(z) should advance the input one full 
sample relative to Ho(z), i.e., Hr(z) = zHo(z). There are a number of 
ways to design/implement the additional filter. 


1. Design a master filter of length LV, + 1 (where N, is the polyphase 
filter length), and then construct 


Vp, p € {0,.--,L}: (p € {0,..., L}) 


Note that hz [k] = ho[k + 1] forO <k < N, — 2. 
2. Set H(z) = Ho(z) and advance the input stream to the last filter by 
one sample (relative to the other filters). 


In certain applications the rate of resampling needs to be adjusted on-the- 
fly. The arbitrary rate resampler easily facilitates this requirement by 
replacing Q with Q,,, in the definitions for 2», Dm, and am. 


Finally, it should be mentioned that a more sophisticated interpolation could 
be used, e.g., Lagrange interpolation involving more than two branch 
outputs. By making the interpolation more accurate, fewer polyphase filters 
would be necessary for the same overall performance, reducing the storage 


requirements for polyphase filter taps. On the other hand, combining the 
outputs of more branches requires more computations per output point. 
Essentially, the different schemes tradeoff between storage and 
computation. 


Multi-stage Interpolation and Decimation 


Multistage Decimation 


In the single-stage interpolation structure illustrated in [link], the required 
impulse response of H(z) can be very long for large L. 
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Consider, for example, the case where LZ = 30 and the input signal has a 
bandwidth of wo = 0.97 radians. If we desire passband ripple 6, = 0.002 
and stopband ripple 6, = 0.001, then Kaiser's formula approximates the 
required FIR filter length to be 


—10log (6pd5) — 13 
Ni ~ —10log (pds) — 18 ~ 900 
2.3A(w) 


choosing A(w) = **—*#° as the width of the first transition band (i.e., 
ignoring the other transition bands for this rough approximation). Thus, a 
polyphase implementation of this interpolation task would cost about 900 


multiplies per input sample. 


Consider now the two-stage implementation illustrated in [link]. 


We claim that, when L is large and wo is near Nyquist, the two-stage 
scheme can accomplish the same interpolation task with less computation. 


Let's revisit the interpolation objective of our previous example. Assume 
that L, = 2 and Lz = 15so that LD; Ly = L = 30. We then desire a pair 

{ F(z), G(z)} which results in the same performance as H(z). As a means 
of choosing these filters, we employ a Noble identity to reverse the order of 
filtering and upsampling (see [link]). 
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It is now clear that the composite filter G(z"’) F(z) should be designed to 
meet the same specifications as H(z). Thus we adopt the following 
strategy: 


1. Design G Cae to sea unwanted images, keeping in mind that the 
DTFT G(e*>“) is 22 20 -periodic i in w. 
2. Design F'(z) to ece the remaining images. 


The first and second plots in [link] illustrate example DTFTs for the desired 
signal z[n| and its D-upsampled version [I] , respectively. Our objective 
for interpolation, is to remove all but the shaded spectral image shown in 
the second plot. The third plot shows that, due to symmetry requirements 
G(z = will be able to remove only one image in the frequency range 


[0, ae le Due to its periodicity, however, G (z 15 ) wy removes some of the 
other undesired images, namely those centered at [> + mie “ form € Z. 
F(z) is then pe to remove the remaining ine es namely those 
centered at m2 = form € Z such that m is not a multiple of 15. Since it is 
possible that re passband ripples of F(z) and G(z" >) could add 
constructively, we specify 6, = 0.001 for both F(z) and G(z), half the 
passband ripple specified for H(z). Assuming that the transition bands in 
F(z) have gain no greater than one, the stopband ripples will not be 


amplified and we can set 6, = 0.001 for both F(z) and G(z), the same 
specification as for H(z). 
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The computational savings of the multi-stage structure result from the fact 
that the transition bands in both F(z) and G(z) are much wider than the 
transition bands in H(z). From the block diagram, we can infer that the 
transition band in G(z) is centered at w = + with width 

T— wo = v na rad. Likewise, the transition bands in F(z) have width 
Ant — 2.2 7 rad, Plugging these specifications into the Kaiser length 


30 40 
approximation, we obtain 


N, ~ 64 
and 
Ny ied 88 


Already we see that it will be much easier, computationally, to design two 
filters of lengths 64 and 88 than it would be to design one 900-tap filter. 


As we now show, the computational savings also carry over to the operation 
of the two-stage structure. As a point of reference, recall that a polyphase 


implementation of the one-stage interpolator would require Np, ~ 900 
multiplications per input point. Using a cascade of two single-stage 
polyphase interpolators to implement the two-stage scheme, we find that the 
first interpolator would require NV, ~ 64 per input point z[n], while the 
second would require N+ ~ 88 multiplies per output of G(z). Since G(z) 
outputs two points per input z[n], the two-stage structure would require a 
total of 


~ 64+ 2 x 88 = 240 


multiplies per input. Clearly this is a significant savings over the 900 
multiplies required by the one-stage structure. Note that it was 
advantageous to choose the first upsampling ratio (£1) as small as possible, 
so that the second stage of interpolation operates at a low rate. 


Multi-stage decimation can be formulated in a very similar way. Using the 
same example quantities as we did for the case of multi-stage interpolation, 


we have the block diagrams and filter-design methodology illustrated in 
[link] and [Link]. 
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Sub-Band Processing 
Why Filterbanks? 


Sub-band Processing 


There exist many applications in modern signal processing where it is 
advantageous to separate a signal into different frequency ranges called 
sub-bands. The spectrum might be partitioned in the uniform manner 
illustrated in [link], where the sub-band width A; = ae is identical for 


each sub-band and the band centers are uniformly spaced at intervals of se 


Alternatively, the sub-bands might have a logarithmic spacing like that 
shown in [link]. 
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For most of our discussion, we will focus on uniformly spaced sub-bands. 


The separation into sub-band components is intended to make further 
processing more convenient. Some of the most popular applications for sub- 


band decomposition are audio and video source coding (with the goal of 
efficient storage and/or transmission). 


[link] illustrates the use of sub-band processing in MPEG audio coding. 
There a psychoacoustic model is used to decide how much quantization 
error can be tolerated in each sub-band while remaining below the hearing 
threshold of a human listener. In the sub-bands that can tolerate more error, 
less bits are used for coding. The quantized sub-band signals can then be 
decoded and recombined to reconstruct (an approximate version of) the 
input signal. Such processing allows, on average, a 12-to-1 reduction in bit 
rate while still maintaining "CD quality" audio. The psychoacoustic model 
takes into account the spectral masking phenomenon of the human ear, 
which says that high energy in one spectral region will limit the ear's ability 
to hear details in nearby spectral regions. Therefore, when the energy in one 
sub-band is high, nearby sub-bands can be coded with less bits without 
degrading the perceived quality of the audio signal. The MPEG standard 
specifies 32-channels of sub-band filtering. Some psychoacoustic models 
also take into account "temporal masking" properties of the human ear, 
which say that a loud burst of sound will temporarily overload the ear for 
short time durations, making it possible to hide quantization noise in the 
time interval after a loud sound burst. 


In typical applications, non-trivial signal processing takes place between the 
bank of analysis filters and the bank of synthesis filters, as shown in [link]. 
We will focus, however, on filterbank design rather than on the processing 
that occurs between the filterbanks. 


a{n) 


Our goals in filter design are: 


1. Good sub-band frequency separation (i.e., good "frequency 
selectivity"). 

2. Good reconstruction (i.e., y[n| ~ x|n — d] for some integer delay d) 
when the sub-band processing is lossless. 


The first goal is driven by the assumption that the sub-band processing 
works best when it is given access to cleanly separated sub-band signals, 
while the second goal is motivated by the idea that the sub-band filtering 
should not limit the reconstruction performance when the sub-band 
processing (e.g., the coding/decoding) is lossless or nearly lossless. 


Uniform Filterbanks 


Uniform Filterbanks 


With M uniformly spaced sub-bands, the sub-band width is a radians, 
implying that the sub-band signal can be downsampled by a factor M/ (but 
not more than M) without loss of information. This is referred to as a 
"critically sampled" filterbank. This maximal level of downsampling is 
advantageous when storing or further processing the sub-band signals. With 
critical sampling, the total number of downsampled sub-band output 
samples equals the total number of input samples. Assuming lossless sub- 
band processing, the critically-sampled synthesis/analysis procedure is 
illustrated in [link]: 


Recall that one of our goals in filter design is to ensure that 

y[n] ~ a|n — d] for some integer delay d. From the block diagram above, 
one can see that imperfect analysis filtering will contribute aliasing errors to 
the sub-band signals. This aliasing distortion will degrade y[n] if it is not 
cancelled by the synthesis filterbank. Though ideal brick-wall filters H;,(z) 
and G,(z) could easily provide perfect reconstruction (i.e., 

y|n] = a[n — d)), they would be unimplementable due to their doubly- 
infinite impulse responses. Thus, we are interested in the design of causal 
FIR filters that give near-perfect reconstruction or, if possible, perfect 
reconstruction. 


There are two principle approaches to the design of filterbanks: 


1. Classical: Approximate ideal brick wall filters to ensure good sub-band 
isolation (i.e., frequency selectivity) and accept (a hopefully small 
amount of) aliasing and thus reconstruction error. 

2. Modern: Constrain the filters to give perfect (or near-perfect) 
reconstruction and hope for good sub-band isolation. 


Uniform Modulated Filterbank 
A modulated filterbank is composed of analysis branches which: 


1. Modulate the input to center the desired sub-band at DC 
2. Lowpass filter the modulated signal to isolate the desired sub-band 
3. Downsample the lowpass signal 


The synthesis branches interpolate the sub-band signals by upsampling and 
lowpass filtering, then modulate each sub-band back to its original spectral 
location (see [link]). 


In M-branch critically-sampled uniformly-modulated filterbanks, the kth 
branch has a modulation frequency of +(w,) = 34k radians, a lowpass 
cutoff frequency of +; radians, and a downsampling/upsampling factor of 
M. 


The most common way to implement this type of uniform modulated 
filterbank is through the use of polyphase filterbanks and DFTs. 


Uniformally Modulated (DFT) Filterbank 
This module covers the Uniformally Modulated Filterbanks. 


The uniform modulated filterbank can be implemented using polyphase 
filterbanks and DFTs, resulting in huge computational savings. [link] below 
illustrates the equivalent polyphase/DFT structures for analysis and synthesis. 
The impulse responses of the polyphase filters P;(z) and P;(z) can be defined in 
the time domain as p;|m| = h[mM + I] and pj(m) = h[mM + J], where h[n!] 
and h|n]| denote the impulse responses of the analysis and synthesis lowpass 
filters, respectively. 
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Recall that the standard implementation performs modulation, filtering, and 
downsampling, in that order. The polyphase/DFT implementation reverses the 
order of these operations; it performs downsampling, then filtering, then 
modulation (if we interpret the DFT as a two-dimensional bank of "modulators"). 
We derive the polyphase/DFT implementation below, starting with the standard 
implementation and exchanging the order of modulation, filtering, and 
downsampling. 


Polyphase/DFT Implementation Derivation 


We start by analyzing the kth filterbank branch, analyzed in [Link]: 
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kth filterbank branch 


The first step is to reverse the modulation and filtering operations. To do this, we 
define a "modulated filter" H;,(z): 
Equation: 
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x(n] is convolved with the modulated filter and that the filter output is 
modulated. This is illustrated in [link]: 
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Notice that the only modulator outputs not discarded by the downsampler are 
those with time index n = mM for m € Z. For these outputs, the modulator has 
the value ef 17M — 1, and thus it can be ignored. The resulting system is 
portrayed by: 


zfn]—e] Hy(2) |x| Lat —e yi 


Next we would like to reverse the order of filtering and downsampling. To apply 
the Noble identity, we must decompose H;,(z) into a bank of upsampled 
polyphase filters. The technique used to derive polyphase decimation can be 
employed here: 

Equation: 


Ay(z) = Yina-oo heln|z™ 
= YD Ue .- hal mM + 2) 


mMm=—Co 


Noting the fact that the /th polyphase filter has impulse response: 
hy[mM + 1] = h[mM + lef M+) — hlmM + e430" = p[mle ar 


where p;|m] is the [th polyphase filter defined by the original (unmodulated) 
lowpass filter H(z), we obtain 
Equation: 
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kth filterbank branch (now containing M polyphase branches) is in [link]: 
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kth filterbank branch containing M/ polyphase 
branches. 


Because it is a linear operator, the downsampler can be moved through the adders 
and the (time-invariant) scalings e(~4)ir*!, Finally, the Noble identity is 
employed to exchange the filtering and downsampling. The kth filterbank branch 
becomes: 
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Observe that the polyphase outputs {v,|m], v;[m]|l = {0,..., M —1}} are 
identical for each filterbank branch, while the scalings 


{e(-FH l= {0,...,M— iy} once. Using these outputs we can compute the 


branch outputs via 
Equation: 


From the previous equation it is clear that y,{m] corresponds to the kth DFT 
output given the M-point input sequence {v;|m]|1 = {0,..., M@ — 1}}. Thus the 
M filterbank branches can be computed in parallel by taking an M-point DFT of 
the / polyphase outputs (see [link]). 


Py (z) 


The polyphase/DFT synthesis bank can be derived in a similar manner. 


Computational Savings of Polyphase/DFT Filterbanks 

This module will briefly take a look a the computational savings of the polyphase/DFT modulated filterbank 
implementation. We will start by looking at our standard analysis bank and then move on to compare this with our 
other implementation approaches. 


Assume that the lowpass filter in the standard analysis bank, H(z), has impulse response length N. To calculate 


the sub-band output vector {y;[m], yx[m]| k = {0,..., 1 — 1}} using the standard structure, we have 
Equation: 
decimator outputs filter outputs multiplications multiplications 
asda | gee E (Ne) = M(N+1)——~ 
vector decimator outputs filter output vector 


where we have included one multiply for the modulator. The calculations above pertain to standard (i.e., not 
polyphase) decimation. If we implement the lowpass/downsampler in each filterbank branch with a polyphase 
decimator, 

Equation: 


multiplications multiplications 


N+1 =M(N+1 
aa ) branch output ( ) 


branch outputs 


vector vector 


To calculate the same output vector for the polyphase/DFT structure, we have approximately 
Equation: 


1 


DFT . ( = 16 yy multiplications . us Polyphase outputs N multiplications ) 
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The table below gives some typical numbers. Recall that the filter length N will be linearly proportional to the 
decimation factor M, so that the ratio — determines the passband and stopband performance. 


= N _ = ANE AS 
M = 32, =, = 10 M = 128, =, = 10 
standard 328,704 20,987,904 
standard with polyphase 10,272 163,968 


polyphase/DFT 400 1,728 


Drawbacks of Classical Filterbank Designs 


To begin with, the reference to "classical" filterbank designs, generally 
refers to the filterbank types that you should have seen up to now. These 
include the following types, that can be reviewed if necessary: 


¢ Uniform Modulated Filterbanks 
¢ Polyphase/DFT Implementation of Uniform Modulated Filterbanks 


Drawbacks to Classical Implementation 


The classical filterbanks that we have considered so far (those listed above) 
give perfect reconstruction performance only when the analysis and 
synthesis filters are ideal. With non-ideal (i.e., implementable) filters, 
aliasing will result from the downsampling/upsampling operation and 
corrupt the output signal. Since aliasing distortion is inherently non-linear, 
it may be very undesirable in certain applications. Thus, long 
analysis/synthesis filters might be required to force aliasing distortion down 
to tolerable levels. The cost of long filters is somewhat offset by the 
efficient polyphase implementation, though. 


That said, clever fliter designs have been proposed which prevent aliasing 
in neighboring sub-bands. These designs include the following references: 
Rothweiler, Crochiere and Rabiner, and Vaidyanathan. As neighboring- 
subband aliasing typically constitutes the bulk of aliasing distortion, these 
designs give significant performance gains. In fact, such filter designs are 
used in MPEG high-performance audio compression standards. 


Aliasing-Cancellation Conditions of Filterbanks 
This module will look at methods and examples of aliasing-cancellation 
conditions. 


Introduction 


It is possible to design combinations of analysis and synthesis filters such 
that the aliasing from downsampling/upsampling is completely cancelled. 
Below we derive aliasing-cancellation conditions for two-channel 
filterbanks. Though the results can be extended to M-channel filterbanks in 
a rather straightforward manner, the two-channel case offers a more lucid 
explanation of the principle ideas (see [link]). 
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Aliasing Cancellation Conditions 
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The aliasing cancellation conditions follow directly from the input/output 
equations derived below. Let i € {0,1} denote the filterbank branch index. 
Then 

Equation: 


Equation: 
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Ao(z)  Ai(z) 
Ho(—z) Hi(-z) 
component matrix. For aliasing cancellation, we need to ensure that 
X (—z) does not contribute to the output Y(z). This requires that 


where H(z) = ( ’ . H(z) is often called the aliasing 
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which is guaranteed by 
Equation: 


or by the following pair of conditions for any rational C'(z) 
Go(z) = C(z)Hi(—z) 
Gi(z) = (—C(z))Ho(—z) 
Under these aliasing-cancellation conditions, we get the input/output 


relation 
Equation: 


where T(z) = + (Ho(z)Hi(—z) — Hi(z)Ho(—z))C(z) represents the 
system transfer function. We say that "perfect reconstruction" results when 
y|n] = 2[n — 1] for some 1 € N, or equivalently when T(z) = z!. 


Note: The aliasing-cancellation conditions remove one degree of freedom 
from our filterbank design; originally, we had the choice of four transfer 
functions {Ho(z), Hi(z), Go(z), Gi(z)}, whereas now we can choose 
three: {Ho(z), Hi(z), C(z)}. 


Two-Branch Quadvalue Mirror Filterbank (QMF) 


Quadrature Mirror Filterbanks 


The quadrature mirror filterbank (QMF) is an aliasing-cancellation 
filterbank with the additional design choices: 


Combining the various design rules, it is easy to see that all filters will be 
causal, real-coefficient, and FIR. The QMF choices yield the system 
transfer function 

Equation: 


T(z) = H(z) — Hy(z) 


The name "QME" is appropriate for the following reason. Note that 
pee) le) =e) 


where the last step follows from the DIFT conjugate-symmetry of real- 
coefficient filters. This implies that the magnitude responses |H 0 (e™) | and 


| H;(e”) | from a mirror-image pair, symmetric around w = = = 22 (the 


7 — 4A 
"quadrature frequency"), as illustrated in [link]. 
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The QMF design rules imply that all filters in the bank are directly related 
to the "prototype" filter Ho(z), and thus we might suspect a polyphase 
implementation. In fact, one exists. Using the standard polyphase 


decomposition of Ho(z), we have 
Equation: 


H(z) = Po(z*) + 27" P, (27) 


so that 
H(z) = Ho(—z) = Po(z’) — 21 P,(2’) 
Go(z) = 2H)(—z) = 2P9(z*) + 22" P, (2) 
Gi(z) = —2Ho(—z) = 2Po(z’) + 22 P,(z’) 


Application of the Noble identity results in the polyphase structure in [link]: 


The QMF choice C(z) = 2 implies that the synthesis filters have twice the 
DC gain of the corresponding analysis filters. Recalling that decimation by 
2 involves anti-alias lowpass filtering with DC gain equal to one, while 


interpolation by 2 involves anti-image lowpass filtering with DC gain equal 
to 2, [link] suggests an explanation for the choice C(z) = 2. 


Perfect Reconstruction QMF 


Perfect Reconstruction QMF 


The system transfer function for a QMF bank is 
Equation: 


T(z) 


H?(z) — Hy*(z) 
= 4271P)(2*)P,(z*) 


For perfect reconstruction, we need T(z) = z ! for some I € N, which 
implies the equivalent conditions 


Po(z*) Pi (2) = a 
1 
Po(z)Pi(z) = 7 


For FIR polyphase filters, this can only be satisfied by 
Po(z) = Boz ™ 
P,(z) = Biz ™ 

where we have nj + ny = — and 696, = +. 


In other words, the polyphase filters are trivial, so that the prototype filter 
H(z) has a two-tap response. With only two taps, Ho(z) cannot be a very 
good lowpass filter, meaning that the sub-band signals will not be spectrally 
well-separated. From this we conclude that two-channel [footnote] perfect 
reconstruction QMF banks exist but are not very useful. 

It turns out that 1/-channel perfect reconstruction QMF banks have more 
useful responses for larger values of M. 


Johnston's QMF Banks 


Two-channel perfect-reconstruction QME banks are not very useful because the 
analysis filters have poor frequency selectivity. The selectivity characteristics can be 
improved, however, if we allow the system response T’ (ec) to have magnitude- 
response ripples while keeping its linear phase. 


Say that Ho(z) is causal, linear-phase, and has impulse response length N. Then it 
is possible to write Ho (e) in terms of a real-valued zero-phase response Ho (e) , 


so that 
Equation: 


Equation: 
T(e”) = Hy’ (e”) _ Hy" (et) 
_ e(-)4N-1) ff,” (et) _ e(-)(e-)(N-1) Fy (etle-7)) 
—  ¢(-iw(N-1) (Ho (c*) _ eit) Hy (ce) ) 


Note that if NV is odd, et(N-1) — i. 
Equation: 


A null in the system response would be very undesirable, and so we restrict N to be 
an even number. In that case, 
Equation: 
; oD ade 
T (e) = e(-)w(N-1) (He (e”) + Hy (ee) ) 


= ef 4) (([Ho(e%)])? + (Hole) /)") 


Note: The system response is linear phase, but will have amplitude distortion if 
(| Ho (e”) ))’ == (| Ho (et) ) ” is not equal to a constant. 


Johnston's idea was to assign a cost function that penalizes deviation from perfect 
reconstruction as well as deviation from an ideal lowpass filter with cutoff wo. 
Specifically, real symmetric coefficients ho|n] are chosen to minimize 

Equation: 


ie af (|Ho(e)|)? dw — ifr ~ (|Ho(e)))? — (|Ho(e) |)" dw 


0 


where 0 < A < 1 balances between the two conflicting objectives. Numerical 
optimization techniques can be used to determine the coefficients, and a number of 


popular coefficient sets have been tabulated. (See Crochiere and Rabiner, Johnston, 
and Ansari and Liu) 


Example: 
"12B" Filter 
As an example, consider the "12B" filter from Johnston: 


ho|[0] = —0.006443977 = ho[11] 
ho[1] = 0.02745539 = Ao[10] 
ho|2] = —0.00758164 = ho|9} 
ho[3] = —0.0913825 = ho[8] 
ho|[4] = 0.09808522 = h,(7] 
ho[B] = 0.4807962 = ho|6| 


which gives the following DTFT magnitudes ([link]). 


analysis titers 


The top plot shows the analysis filters and the 
bottom one shows the system response. 


Perfect Reconstruction FIR Filter Banks 


FIR Perfect-Reconstruction Conditions 


The QME design choices prevented the design of a useful (i.e., frequency 
selective) perfect-reconstruction (PR) FIR filterbank. This motivates us to 
re-examine PR filterbank design without the overly-restrictive QMF 
conditions. However, we will still require causal FIR filters with real 
coefficients. 


Recalling that the two-channel filterbank ([Link]), 


has the input/output relation: 
Equation: 


¥(2) = 5 (X(2) X(-2)) ( ae 7 (este) 


we see that the delay-/ perfect reconstruction requires 


Equation: 
(0 )= (nics mes) (ee) 
where 


won (Ei, 2) 


or, equivalently, that 
Equation: 


(aa) = Pe( ) 


where 
Equation: 


det H(z) = Ho(z)Mi(—z) — Ho(—z)Ai(z) 


For FIR Go(z) and G1(z), we require [footnote] that 
Equation: 


det H(z) =cz* 


forc € Randk € Z. Under this determinant condition, we find that 


Equation: 
(Gi) =e (cata) 


Assuming that Ho(z) and H,(z) are causal with non-zero initial coefficient, 
we choose k = | to keep Go(z) and G;(z) causal and free of unnecessary 
delay. 

Since we cannot assume that FIR Ho(z) and H(z) share a common root. 


Summary of Two-Channel FIR-PR Conditions 
Summarizing the two-channel FIR-PR conditions: 


Ho(z) A H(z) = causal real-coefficient FIR 
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Orthogonal Perfect Reconstruction FIR Filterbank 


Orthogonal PR Filterbanks 


The FIR perfect-reconstruction (PR) conditions leave some freedom in the 
choice of Ho(z) and H;(z). Orthogonal PR filterbanks are defined by 
causal real-coefficient even-length-N analysis filters that satisfy the 
following two equations: 

Equation: 


1= H(z) Ho(z*) - H(-z)Ho(—z") 
Equation: 


Hy(z) = (+(z)) “Ho (-27') 


To verify that these design choices satisfy the FIR-PR requirements for 
H(z) and H;(z), we evaluate det H(z) under the second condition above. 
This yields 

Equation: 


det H(z) = +(Ho(z)Hi(z') — Ho(-z)Ai(z)) 
(—2 4) (Ho(z)Ho(z*) + H(-z)Ho(—z')) 


zg (N-1) 


which corresponds to c = —1 and! = N — 1 in the FIR-PR determinant 
condition det H(z) = cz~!. The remaining FIR-PR conditions then imply 
that the synthesis filters are given by 
Equation: 
Go(z) = -—2H, ag, 
22-(N-V) Hy(27) 


Equation: 


Gi(z) = 2Ho(—z) 
= 22° D (271) 


The orthogonal PR design rules imply that Ho (e”) is "power symmetric" 
and that { Ho Car Ay (e”) } form a "power complementary" pair. To see 
the power symmetry, we rewrite the first design rule using z = e and 
—1 = e*"", which gives 

Equation: 


1 Hp (e’) Hp (e@)) + Ho (e-")) Ho (ek UG) 
ene fg a AND 
= (|Ho(e)|) + (Hole ™))) 


(|Ho(e) |)” + (|o(e") |)” 


The last two steps leveraged the fact that the DIFT of a real-coefficient 
filter is conjugate-symmetric. The power-symmetry property is illustrated in 
[link]: 
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Power complementarity follows from the second orthogonal PR design rule, 
which implies | H;(e)| = |Ho(e*) |. Plugging this into the previous 
equation, we find 

Equation: 


1 = (|Ho(e)|)? + (|e(e*)])? 


The power-complimentary property is illustrated in [link]: 
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Design of Orthogonal PR-FIR Filterbanks via Halfband Spectral 
Factorization 


Design of Orthogonal PR-FIR Filterbanks via Halfband 
Spectral Factorization 


Recall that analysis-filter design for orthogonal PR-FIR filterbanks reduces 
to the design of a real-coefficient causal FIR prototype filter Ho(z) that 
satisfies the power-symmetry condition 

Equation: 


2 


( Hole) )? + (ato(e™) J = 


Power-symmetric filters are closely related to "halfband" filters. A zero- 
phase halfband filter is a zero-phase filter F(z) with the property 
Equation: 


F(z) + F(-z) =1 


When, in addition, F(z) has real-valued coefficients, its DTFT is 
"amplitude-symmetric": 
Equation: 


F(e) ++ F (eit) =1 


The amplitude-symmetry property is illustrated in [link]: 


If, in addition to being real-valued, [footnote] F’ (ec) is non-negative, then 
F (e) constitutes a valid power response. If we can find H(z) such that 


( Ho (e™) )? = F(e'), then this Ho(z) will satisfy the desired power- 


symmetry property ( Ho (eo) )? + ( Ho (et("-+)) : — 1, 
Recall that zero-phase filters have real-valued DTFTs. 


First, realize F’ (ec) is easily modified to ensure non-negativity: construct 
q\n] = f[n] + €6[n] for sufficiently large €, which will raise F(e™’) by ¢ 
uniformly over w (see [link]). 


Qe) —_«_— Qe) 


The resulting Q(z) is non- oe and satisfies the amplitude-symmetry 
condition Q(e ay = Q(ei Hye = 1] + 2e. We will make up for the 
additional gain later. The procedure by which Ho(z) can be calculated from 
the raised halfband Q(z), known as spectral factorization, is described 
next. 


Since q[n]| is conjugate-symmetric around the origin, the roots of Q(z) 


come in pairs { a,, +. . This can be seen by writing Q(z) in the factored 


form below, which clearly corresponds to a polynomial with coefficients 
conjugate-symmetric around the 0'"-order coefficient. 


Equation: 


Q(z) 


aaa are 
AyIx' (1 — ajz~") (1 — ajz) 


where A € R*. Note that the complex numbers { ai, i} are symmetric 


across the unit circle in the z-plane. Thus, for ever root of Q(z) inside the 
unit-circle, there exists a root outside of the unit circle (see [link]). 


Let us assume, without loss of generality, that |a;| < 1. If we form H(z) 
from the roots of Q(z) with magnitude less than one: 
Equation: 


N-1 
Hy(z)=VA Il 1—a,;z! 
i=1 


then it is apparent that ( Ho (e™) a = Q(e™). This Ho(z) is the so- 
called minimum-phase spectral factor of Q(z). 


Actually, in order to make ( Ao (ec) )? = () (eo), we are not required to 
choose all roots inside the unit circle; it is enough to choose one root from 
every unit-circle-symmetric pair. However, we do want to ensure that 
H(z) has real-valued coefficients. For this, we must ensure that roots 
come in conjugate-symmetric pairs, i.e., pairs having symmetry with 
respect to the real axis in the complex plane ((link]). 


Because Q(z) has real-valued coefficients, we know that its roots satisfy 
this conjugate-symmetry property. Then forming Ho(z) from the roots of 
Q(z) that are strictly inside (or strictly outside) the unit circle, we ensure 
that Ho(z) has real-valued coefficients. 


Finally, we say a few words about the design of the halfband filter F(z). 
The window design method is one technique that could be used in this 
application. The window design method starts with an ideal lowpass filter, 
and windows its doubly-infinite impulse response using a window function 
with finite time-support. The ideal real-valued zero-phase halfband filter 
has impulse response (where n € Z): 

Equation: 


which has the important property that all even-indexed coefficients except 
f [0] equal zero. It can be seen that this latter property is implied by the 
halfband definition F(z) + F'(z~1) = 1 since, due to odd-coefficient 


cancellation, we find 
Equation: 


1 = F(z)+ F(z") 
202 f [2m]z-O™) ef (2m) = Lal 


Note that windowing the ideal halfband does not alter the property 
f (2m) = +6[m], thus the window design F(z) is guaranteed to be 
halfband feature. Furthermore, a real-valued window with origin-symmetry 


preserves the real-valued zero-phase property of { f [nl \ above. It turns out 


that many of the other popular design methods (e.g., LS and equiripple) also 
produce halfband filters when the cutoff is specified at + radians and all 


passband/stopband specifications are symmetric with respect to w = +. 


Design Procedure Summary 


We now summarize the design procedure for a length-NV analysis lowpass 
filter for an orthogonal perfect-reconstruction FIR filterbank: 


1. Design a zero-phase real-coefficient halfband lowpass filter 
F(z) = pan n-1) f[n|z"” where N is a positive even integer (via, 
e.g., window designs, LS, or equiripple). 

2. Calculate ¢, the maximum negative value of F’ (ce). (Recall that 
F (ec) is real-valued for all w because it has a zero-phase response.) 
Then create "raised halfband" Q(z) via q[n] = f[n] + €6|n], ensuring 
that Oe) > 0, forall w. 

3. Compute the roots of Q(z), which should come in unit-circle- 


symmetric pairs {ai, _. \ Then collect the roots with magnitude less 


than one into filter H(z). 


4. H, 0(z) is the desired prototype filter except for a scale factor. Recall 
that we desire 


( Hole) J? + ( Ho(e"") )°=1 


Using Parseval's Theorem, we see that {ho in) should be scaled to 
give {ho[n]} for which oo ho?[n] = 5. 


Bi-Orthogonal Perfect Reconstruction FIR Filterbanks 


Bi-Orthogonal Filter Banks 


Due to the minimum-phase spectral factorization, orthogonal PR-FIR 
filterbanks will not have linear-phase analysis and synthesis filters. Non- 
linear phase may be undesirable for certain applications. "Bi-orthogonal" 
designs are closely related to orthogonal designs, yet give linear-phase 
filters. The analysis-filter design rules for the bi-orthogonal case are 


F(z) : zero- me real-coefficient halfband such that 
F(z) = 3 n-1) f[n|z, where N is even. 
WD F(2) = Hole) Ey (-2) 


It is straightforward to verify that these design choices satisfy the FIR 
perfect reconstruction condition det H(z) = cz~! with c = 1 and 
l[=N-—1: 
Equation: 

det H(z) = HAo(z)Ai(—z) — Ho(—z)A1(z) 
ad Gg AA ee) 
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Furthermore, note that z~("~1) F(z) is causal with real coefficients, so that 
both Ho(z) and H;(z) can be made causal with real coefficients. (This was 
another PR-FIR requirement.) The choice c = 1 implies that the synthesis 
filters should obey 


Go(z) = 2H, (—z) 
Gi(z) = —2Ho(—z) 


From the design choices above, we can see that bi-orthogonal analysis filter 
design reduces to the factorization of a causal halfband filter z~“"~) F(z) 


into Ho(z) and H;(z) that have both real coefficients and linear-phase. 
Earlier we saw that linear-phase corresponds to root symmetry across the 
unit circle in the complex plane, and that real-coefficients correspond to 
complex-conjugate root symmetry. Simultaneous satisfaction of these two 
properties can be accomplished by quadruples of roots. However, there are 
special cases in which a root pair, or even a single root, can simultaneously 
satisfy these properties. Examples are illustrated in [link]: 


The design procedure for the analysis filters of a bi-orthogonal perfect- 
reconstruction FIR filterbank is summarized below: 


1. Design a zero-phase real-coefficient filter 
F(z) =>” = n-1 f [n|z—” where N is a positive even integer (via, 


e.g., window designs, LS, or equiripple). 

2. Compute the roots of F(z) and partition into a set of root groups 
{Go, Gi, Go, ...} that have both complex-conjugate and unit-circle 
symmetries. Thus a root group may have one of the following forms: 


1 1 
G; _ {a a) =} 
ay a; 


Vai, Ja; =i (G; — VO Ga}) 


Vaj;,a; ER: @ = {4 ~}) 
a; 


Va;,a; = (1) : (G; = {a;}) 


Choose [footnote] a subset of root groups and construct H(z) from 


those roots. Then construct H,(—z) from the roots in the remaining 
root groups. Finally, construct H(z) from H,(—z) by reversing the 
signs of odd-indexed coefficients. 

Note that H(z) and H(z) will be real-coefficient linear-phase 
regardless of which groups are allocated to which filter. Their 
frequency selectivity, however, will be strongly influenced by group 
allocation. Thus, you many need to experiment with different 
allocations to find the best highpass/lowpass combination. Note also 
that the length of Ho(z) may differ from the length of Ho(z). 


. H(z) and H(z) are the desired analysis filters up to a scaling. To 
take care of the scaling, first create Ho(z) = aHo(z) and 

A, (z) = bHy(z) where a and b are selected so that 

S~, ho[n] = 1 = So, h(n]. Then create Ho(z) = cHo(z) and 
H(z) = cH(z) where c is selected so that the property 


zW-) F(z) = Ho(z)Hi(—z) 


is satisfied at DC (i.e., z = et) — 1). In other words, find c so that 


Yin holn] dom Paln|(—1)™ = 1. 


Filterbanks with >2 Branches 


Filterbanks with >2 Branches 


Thus far the previous discussion on filterbanks has concentrated on 
"modern" filterbanks with only two branches. There are two standard ways 
by which the number of branches can be increased. 


M-Band Filterbanks 


The ideas used to construct two-branch PR-FIR filterbanks can be directly 
extended to the M/-branch case. (See Vaidyanathan and Mitra) This yields, 
for example, a polynomial matrix H(z) with M rows and M columns. For 
these /-band filterbanks, the sub-bands will have uniform widths on 
radians (in the ideal case) [link]. 


M-band Filterbank 


Multi-Level (Cascade) Filterbanks 


The two-branch PR-FIR filterbanks can be cascaded to yield PR-FIR 
filterbanks whose sub-band widths equal 2~*z for non-negative integers k 
(in the ideal case). If the magnitude responses of the filters are not well 
behaved, however, the cascading will result in poor effective frequency- 
selectivity. Below we show a filterbank in which the low-frequency sub- 
bands are narrower than the high-frequency sub-band. Note that the number 
of input samples equals the total number of sub-band samples. 


Multi-level (Cascaded) Filterbank 


We shall see these structures in the context of the discrete wavelet 
transform. 


Why Transforms? 


In the field of signal processing we frequently encounter the use of 
transforms. Transforms are named such because they take a signal and 
transform it into another signal, hopefully one which is easier to process or 
analyze than the original. Essentially, transforms are used to manipulate 
signals such that their most important characteristics are made plainly 
evident. To isolate a signal's important characteristics, however, one must 
employ a transform that is well matched to that signal. For example, the 
Fourier transform, while well matched to certain classes of signal, does not 
efficiently extract information about signals in other classes. This latter fact 
motivates our development of the wavelet transform. 


Limitations of Fourier Analysis 


Let's consider the Continuous-Time Fourier Transform (CTFT) pair: 


X(2) = - a(t)e “) dt 


1 re ; 
x(t) = = / x(Mei* a2 


The Fourier transform pair supplies us with our notion of "frequency." In 
other words, all of our intuitions regarding the relationship between the 
time domain and the frequency domain can be traced to this particular 
transform pair. 


It will be useful to view the CTFT in terms of basis elements. The inverse 
CTFT equation above says that the time-domain signal x(t) can be 
expressed as a weighted summation of basis elements 

{bp(t), be(t)| — co < Q < oo}, where b(t) = e™* is the basis element 
corresponding to frequency §2. In other words, the basis elements are 
parameterized by the variable 2 that we call frequency. Finally, X(.2) 
specifies the weighting coefficient for b(t). In the case of the CTFT, the 
number of basis elements is uncountably infinite, and thus we need an 
integral to express the summation. 


The Fourier Series (FS) can be considered as a special sub-case of the 
CTFT that applies when the time-domain signal is periodic. Recall that if 
z(t) is periodic with period T,, then it can be expressed as a weighted 


summation of basis elements {b;(t)}|?°_., where b;,(t) = ei tk, 


x(t) = 3 X{[kletr 


k=—0o 


Here the basis elements comes from a countably-infinite set, parameterized 
by the frequency index k € Z. The coefficients {X|k]}|°° . specify the 


strength of the corresponding basis elements within signal x(t). 


Though quite popular, Fourier analysis is not always the best tool to analyze 
a signal whose characteristics vary with time. For example, consider a 
signal composed of a periodic component plus a sharp "glitch" at time to, 
illustrated in time- and frequency-domains, [link]. 


z(t) |X(Q)| 
& 4 
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Thr tL. 
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Fourier analysis is successful in reducing the complicated-looking periodic 
component into a few simple parameters: the frequencies {.2,, 22} and 
their corresponding magnitudes/phases. The glitch component, described 
compactly in terms of the time-domain location tg and amplitude, however, 
is not described efficiently in the frequency domain since it produces a wide 
spread of frequency components. Thus, neither time- nor frequency-domain 
representations alone give an efficient description of the glitched periodic 
signal: each representation distills only certain aspects of the signal. 


As another example, consider the linear chirp x(t) = sin (,2t”) illustrated 
in [Link]. 
a(t) 
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Though written using the sin(-) function, the chirp is not described by a 
single Fourier frequency. We might try to be clever and write 


sin(2t?) = sin(2t-t) = sin(Q(t)-t) 


where it now seems that signal has an instantaneous frequency 

(Q(t) = Mt which grows linearly in time. But here we must be cautious! 
Our newly-defined instantaneous frequency §2(t) is not consistent with the 
Fourier notion of frequency. Recall that the CTFT says that a signal can be 
constructed as a superposition of fixed-frequency basis elements e“* with 
time support from —oo to +(co); these elements are evenly spread out over 
all time, and so there is noting instantaneous about Fourier frequency! So, 
while instantaneous frequency gives a compact description of the linear 
chirp, Fourier analysis is not capable of uncovering this simple structure. 


As a third example, consider a sinusoid of frequency {29 that is 
rectangularly windowed to extract only one period ([link]). 
x(t) |X (2) 


Instantaneous-frequency arguments would claim that 


Qo if t € window : 
V0) = ‘i cde wuilas - (a(t) = sin(2(t)-t)) 
where {2(t) takes on exactly two distinct "frequency" values. In contrast, 
Fourier theory says that rectangular windowing induces a frequency- 
domain spreading by a ut profile, resulting in a continuum of Fourier 
frequency components. Here again we see that Fourier analysis does not 
efficiently decompose signals whose "instantaneous frequency" varies with 
time. 


Time-Frequency Uncertainty Principle 


Recall that Fourier basis elements b(t) = e’”* exhibit poor time 
localization abilities - a consequence of the fact that b(t) is evenly spread 
over all t € (—oo, co). By time localization we mean the ability to clearly 
identify signal events which manifest during a short time interval, such as 
the "glitch" described in an earlier example. 


At the opposite extreme, a basis composed of shifted Dirac deltas 
b(t) = A(t — T) would have excellent time localization but terrible 
"frequency localization," since every Dirac basis element is evenly spread 
over all Fourier frequencies 2 € [—00, oo]. This can be seen via 

Q)|= See b (te) d t| = 1 V(2), regardless of 7. By 
frequency localization we mean the ability to clearly identify signal 
components which are concentrated at particular Fourier frequencies, such 
as sinusoids. 


These observations motivate the question: does there exist a basis that 
provides both excellent frequency localization and excellent time 
localization? The answer is "not really": there is a fundamental tradeoff 
between the time localization and frequency localization of any basis 
element. This idea is made concrete below. 


Let us consider an arbitrary waveform, or basis element, b(t). Its CTFT will 
be denoted by B(2). Define the energy of the waveform to be F, so that 
(by Parseval's theorem) 


B= f (oo) * (|B(Q))? 4a 


On = 


Next, define the temporal and spectral centers [footnote] as 


1 iad 2 
t= Tf e«oOD 


1 ~ 2 
Oe rane A(|B(Q)|)2 da 


and the temporal and spectral widths [footnote] as 


A= y/= f (-t)%((o@)? at 


Ao = i [ (2-2) Ba)? 4.2 


If the waveform is well-localized in time, then b(¢) will be concentrated at 
the point t, and A, will be small. If the waveform is well-localized in 
frequency, then B(2) will be concentrated at the point 2, and Ag will be 
small. If the waveform is well-localized in both time and frequency, then 
A;Ag will be small. The quantity A; Ay is known as the time-bandwidth 
product. 


It may be interesting to note that both = (|b(t) |)? and op (|B(2)| )? are 


non-negative and integrate to one, thereby satisfying the requirements of 
probability density functions for random variables ¢ and {2. The 
temporal/spectral centers can then be interpreted as the means (i.e., centers 
of mass) of t and 2. 

The quantities A,” and Ag? can be interpreted as the variances of t and 2, 
respectively. 


From the definitions above one can derive the fundamental properties 
below. When interpreting the properties, it helps to think of the waveform 
b(t) as a prototype that can be used to generate an entire basis set. For 
example, the Fourier basis {b(t), ba(t)| — co < 2 < co} canbe 
generated by frequency shifts of b(t) = 1, while the Dirac basis 

{b,(t)| — co < tT < oo}b,(t) can be generated by time shifts of 

b(t) = 6(t) 


1. A; and Ag are invariant to time and frequency [footnote] shifts. 


Vto € R: (A¢(b(¢)) = Ar (b(t — to))) 


VQ €R: (Ap(B(2)) = Ag(B(2 — 2))) 


This implies that all basis elements constructed from time and/or 
frequency shifts of a prototype waveform 6(¢) will inherit the temporal 
and spectral widths of b(t). 

Keep in mind the fact that b(t) and B(Q) = f™ b(t)e “) dt are 
alternate descriptions of the same waveform; we could have written 
An(b(t)e**) in place of A(B(2 — 2)) above. 

. The time-bandwidth product A; 4 is invariant to time-scaling. 
[footnote] 


The above two equations imply 


Va ER: (AyAg(b(at)) = A¢Ag(d(t))) 


Observe that time-domain expansion (i.e., |a| < 1) increases the 
temporal width but decreases the spectral width, while time-domain 
contraction (i.e., |a| > 1) does the opposite. This suggests that time- 
scaling might be a useful tool for the design of a basis element with a 
particular tradeoff between time and frequency resolution. On the other 
hand, scaling cannot simultaneously increase both time and frequency 
resolution. 

The invariance property holds also for frequency scaling, as implied by 
the Fourier transform property b(at) @ +B (2 " 


|a| 


. No waveform can have time-bandwidth product less than +: 
1 
A: An Z iy 


This is known as the time-frequency uncertainty principle. 
. The Gaussian pulse g(t) achieves the minimum time-bandwidth 
product A;Ay = + 


G(Q) =e) 


Note that this waveform is neither bandlimited nor time-limited, but 
reasonable concentrated in both domains (around the points ¢, = 0 and 
42;-=0), 


Properties 1 and 2 can be easily verified using the definitions above. 
Properties 3 and 4 follow from the Cauchy-Schwarz inequality. 


Since the Gaussian pulse g(t) achieves the minimum time-bandwidth 
product, it makes for a theoretically good prototype waveform. In other 
words, we might consider constructing a basis from time shifted, frequency 
shifted, time scaled, or frequency scaled versions of g(t) to give a range of 
spectral/temporal centers and spectral/temporal resolutions. Since the 
Gaussian pulse has doubly-infinite time-support, though, other windows are 
used in practice. Basis construction from a prototype waveform is the main 
concept behind Short-Time Fourier Analysis and the continuous Wavelet 
transform discussed later. 


Short-time Fourier Transform 
This module introduces short-time Fourier transform. 


We saw earlier that Fourier analysis is not well suited to describing local 
changes in "frequency content" because the frequency components defined 
by the Fourier transform have infinite (i.e., global) time support. For 
example, if we have a signal with periodic components plus a glitch at time 
to, we might want accurate knowledge of both the periodic component 


frequencies and the glitch time ([link]). 
x(t) |X(Q)| 
& 4 


ei la 9 Ft, 
TF —— 
A f* 
\w 


\ / 7 
J % 2% 


The Short-Time Fourier Transform (STFT) provides a means of joint time- 
frequency analysis. The STFT pair can be written 


CO 


Kota : x(t)w(t — r)e-@) at 


—CoO 


1 CO CO : 
#(t} = on i / Xstrr(Q, T)w(t — rei add? 


assuming real-valued w(t) for which f (|w(t)|)* d ¢ = 1. The STFT can 
be interpreted as a "sliding window CTFT": to calculate Xgrprr({, 7), slide 
the center of window w(t) to time 7, window the input signal, and compute 
the CTFT of the result ((link]). 

"Sliding Window CTFT" 


f(t) Q (t-t) 


The idea is to isolate the signal in the vicinity of time 7, then perform a 
CTFT analysis in order to estimate the "local" frequency content at time T. 


Essentially, the STFT uses the basis elements 
bo,-(t) = w(t — r)e 


over the range t € (—oo, oo) and 92 € (—oo, co). This can be understood 
as time and frequency shifts of the window function w(t). The STFT basis 
is often illustrated by a tiling of the time-frequency plane, where each tile 


represents a particular basis element ([link]): 
o At 


—_a 
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The height and width of a tile represent the spectral and temporal widths of 
the basis element, respectively, and the position of a tile represents the 
spectral and temporal centers of the basis element. Note that, while the 
tiling diagram suggests that the STFT uses a discrete set of time/frequency 


shifts, the STFT basis is really constructed from a continuum of 
time/frequency shifts. 


Note that we can decrease spectral width Ag at the cost of increased 
temporal width A; by stretching basis waveforms in time, although the 
time-bandwidth product A;\Q (i.e., the area of each tile) will remain 
constant ([link]). 


‘Zi WII 


Our observations can be summarized as follows: 


e the time resolutions and frequency resolutions of every STFT basis 


element will equal those of the window w(t). (All STFT tiles have the 


same shape.) 


e the use of a wide window will give good frequency resolution but poor 
time resolution, while the use of a narrow window will give good time 
resolution but poor frequency resolution. (When tiles are stretched in 


one direction they shrink in the other.) 


e The combined time-frequency resolution of the basis, proportional to 


1 
AAs? 
2 
all shapes, the Gaussian [footnote] w(t) = ee ) gives the 


highest time-frequency resolution, although its infinite time-support 
makes it impossible to implement. (The Gaussian window results in 
tiles with minimum area.) 

The STFT using a Gaussian window is known as the Gabor 
Transform (1946). 


is determined not by window width but by window shape. Of 


Finally, it is interesting to note that the STFT implies a particular definition 
of instantaneous frequency. Consider the linear chirp x(t) = sin (Qot”). 


From casual observation, we might expect an instantaneous frequency of 
(Qo7 at time 7 since 


Vt =7 : (sin(Qt”) = sin(Qrt)) 


The STFT, however, will indicate a time-7 instantaneous frequency of 


~ (Qot”) = 2N0T 


t=T 


Note:The phase-derivative interpretation of instantaneous frequency only 
makes sense for signals containing exactly one sinusoid, though! In 
summary, always remember that the traditional notion of "frequency" 
applies only to the CTFT; we must be very careful when bending the 


notion to include, e.g., "instantaneous frequency", as the results may be 
unexpected! 


Continuous Wavelet Transform 
This module introduces continuous wavelet transform. 


The STFT provided a means of (joint) time-frequency analysis with the 
property that spectral/temporal widths (or resolutions) were the same for all 
basis elements. Let's now take a closer look at the implications of uniform 
resolution. 


Consider two signals composed of sinusoids with frequency 1 Hz and 1.001 
Hz, respectively. It may be difficult to distinguish between these two signals 
in the presence of background noise unless many cycles are observed, 
implying the need for a many-second observation. Now consider two 
signals with pure frequencies of 1000 Hz and 1001 Hz-again, a 0.1% 
difference. Here it should be possible to distinguish the two signals in an 
interval of much less than one second. In other words, good frequency 
resolution requires longer observation times as frequency decreases. Thus, 
it might be more convenient to construct a basis whose elements have larger 
temporal width at low frequencies. 


The previous example motivates a multi-resolution time-frequency tiling of 
the form ([link]): 
|. 
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The Continuous Wavelet Transform (CWT) accomplishes the above multi- 
resolution tiling by time-scaling and time-shifting a prototype function ~(t) 
, often called the mother wavelet. The a-scaled and 7-shifted basis 
elements is given by 


where 
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The conditions above imply that a(t) is bandpass and sufficiently smooth. 
Assuming that || (¢) ||= 1, the definition above ensures that 
|| a,r(t) ||= 1 for all a and 7. The CWT is then defined by the transform 


pair 


Xowt(a,T) = [- x(t)pa,7(t) dt 
— 1 f® £% Xowr(a, 7) a,7(E) 


In basis terms, the CWT says that a waveform can be decomposed into a 
collection of shifted and stretched versions of the mother wavelet w(t). As 
such, it is usually said that wavelets perform a "time-scale" analysis rather 
than a time-frequency analysis. 


The Morlet wavelet is a classic example of the CWT. It employs a 
windowed complex exponential as the mother wavelet: 


ene eee 
p(t) = — ze OM e~S 
/ 27 


2 
log 2 


wavelet does not exactly satisfy the conditions established earlier, since 
(0) ~ 7x 10°’ £0, it can be corrected, though in practice the 
correction is negligible and usually ignored. 


where it is typical to select Qo = 7 . (See illustration.) While this 


(22) | 


While the CWT discussed above is an interesting theoretical and 
pedagogical tool, the discrete wavelet transform (DWT) is much more 
practical. Before shifting our focus to the DWT, we take a step back and 
review some of the basic concepts from the branch of mathematics known 
as Hilbert Space theory (Vector Space, Normed Vector Space, Inner Product 
Space, Hilbert Space, Projection Theorem). These concepts will be essential 
in our development of the DWT. 


Hilbert Space Theory 


Hilbert spaces provide the mathematical foundation for signal processing 
theory. In this section we attempt to clearly define some key Hilbert space 
orthonormal bases, and projections. The intent is not to bury you in 
mathematics, but to familiarize you with the terminology, provide intuition, 
and leave you with a "lookup table" for future reference. 


Vector Space 


e A vector space consists of the following four elements: 


1. A set of vectors V, 
2. A field of scalars (where, for our purposes, is either R or C), 
3. The operations of vector addition "+" (i.e., +: V x V > V) 


4. The operation of scalar multiplication 


""WVe,7: xVoV) 


for which the following properties hold. (Assumex A y A zE€V 


anda A Be .) 


Properties 


commutativity 


associativity 


distributivity 


additive identity 
additive inverse 


multiplicative 
identity 


Examples 

e+y=ytea 
(ety)+z2=ae4+(y+2) 
(af)a = a (Ba) 

a: (@+y) =(a-a) + (a-y) 
(a+ B)x =ax+ Ba 


Ve €V:(50,06€V: (#+0=2)) 


Ve €V: (d—-a,(-x2) €V: (a2 +—-a2 =0)) 


VaeV:(1-x=2) 


Important examples of vector spaces include 


Properties Examples 


real N- 


VER, aR 
vectors 
complex NV- V=cC’ -c 
vectors 
sequencesin WV = {x[n}] Sn © Z: (ya (elnI))” < 00) } 
San = C 
functions in" - 
gL," V={fO| [2 (FOI)? dt < co}, =C 


where we have assumed the usual definitions of addition and multiplication. 
From now on, we will denote the arbitrary vector space (V, , +, :) by the 
shorthand V and assume the usual selection of (, +, :). We will also 
suppress the ":" in scalar multiplication, so that a - x becomes az. 


e A subspace of V is asubset M C V for which 


1.Ve,yexeM A yeEeM:((a+y) eM) 
2.vVeEeM AN ae : (are M) 


Note:Note that every subspace must contain 0, and that V is a 
subspace of itself. 


e The span of set S C V is the subspace of V containing all linear 
combinations of vectors in S. When S = {xo,...,ZN-1}, 


N-1 
span (S) = {Sa a; € 
i=0 


e A subset of linearly-independent vectors {xo,..., 2-1} C V is 
called a basis for V when its span equals V. In such a case, we say 
that V has dimension NV. We say that V is infinite-dimensional 
[footnote] if it contains an infinite number of linearly independent 
vectors. 

The definition of an infinite-dimensional basis would be complicated 
by issues relating to the convergence of infinite series. Hence we 
postpone discussion of infinite-dimensional bases until the Hilbert 
Space section. 

e V is a direct sum of two subspaces M and N, written V = M@N, 
iff every x € V has a unique representation z = m+n form ec M 
andn € N. 


Note:Note that this requires Mm N = {0} 


Normed Vector Space 
Now we equip a vector space V with a notion of "size". 


e A norm is a function (|| - ||: V — R) such that the following 
properties hold(Va,y,wz EV A yeEV:(@EV A ye V)and 
Va,a€ :(a€ )): 


1. || a ||> 0 with equality iff 2 = 0 
2. || aw ||= lal: || @ || 
3. | 


, (the triangle inequality). 


In simple terms, the norm measures the size of a vector. Adding the 
norm operation to a vector space yields a normed vector space. 
Important example include: 


(ae. . tn-1)" = \/ par x;2 = Val zx 


2.V= (xo...ty_1)* |= a “(lel)” = Vale 
3.V= x[n]} |= (Sooo (| nll)?)* 
Lv ps || F(t) [P= an 


Inner Product Space 
Next we equip a normed vector space V with a notion of "direction". 


e An inner product is a function ( ((-,-) : V x V) + C) such that the 
following properties hold ( 
Vea,y,z,2EVAyEVAZEV:(@EVAYEV A ZEV) 
and Va,a@€ :(a€ )): 


y 
,ay) = a ((x, y)) ...implying that (ax, y) = a ((x, y)) 
y +z) = (@,y) + (2, 2) 

,a) > 0 with equality iff « = 0 

In simple terms, the inner product measures the relative alignment 
between two vectors. Adding an inner product operation to a vector 
space yields an inner product space. Important examples include: 


1.V=R", (2, y) = y 

2.V=C%, (ay) = oty 

3.V = la, ({x|n]}, fulnl}) = eh 
4.V = Lo, (f(t), 9(t)) = fo, fg) 4 


The inner products above are the "usual" choices for those spaces. 


The inner product naturally defines a norm: 
| a |[= 4/ (x, a) 


though not every norm can be defined from an inner product. [footnote | 
Thus, an inner product space can be considered as a normed vector space 
with additional structure. Assume, from now on, that we adopt the inner- 
product norm when given a choice. 

An example for inner product space 4 would be any norm 


ll F = “/f% (F OI? at such that p > 2. 


e The Cauchy-Schwarz inequality says 


(@,y)| <i] x [I ll yl 
with equality iff fa € :(# = ay). 


When ((a, y)) € R, the inner product can be used to define an "angle" 
between vectors: 


(x,y) 


cos(@) = ———————_ 
= Tel lal 


e Vectors x and y are said to be orthogonal, denoted as x | y, when 
(a, y) = 0. The Pythagorean theorem says: 


ve Ly: ((le+y ll’ =(lel)?+ (lai) 


Vectors a and y are said to be orthonormal when x | y and 
| @ |=] y ||= 1. 

e xz | Smeansz | y forall y € S. S is an orthogonal set if x | y 
foralla A ye Ss.t.a2 4 y. An orthogonal set S is an orthonormal 


set if || a ||= 1 for all 2 € S. Some examples of orthonormal sets are 
1 0 
PRES Oe F 
0 0 


2. C% : Subsets of columns from unitary matrices 

3. lz : Subsets of shifted Kronecker delta functions 
Sc {{d[n — k]}| k € Z} 

4/252 5 = {+ f(e —nT) ne z} for unit pulse 
f(t) = u(t) — u(t — T), unit step u(t) 


where in each case we assume the usual inner product. 


Hilbert Spaces 


Now we consider inner product spaces with nice convergence properties that allow us to define 
countably-infinite orthonormal bases. 


e A Hilbert space is a complete inner product space. A complete [footnote] space is one where 
all Cauchy sequences converge to some vector within the space. For sequence {z,,} to be 
Cauchy, the distance between its elements must eventually become arbitrarily small: 


Ve,e >0: (AN. : (Vn,m,(n > Nz) A (m>QN-): (|| fn — 2m ||< €))) 


For a sequence {z,,} to be convergent to x, the distance between its elements and # must 
eventually become arbitrarily small: 


Ve,e >0:(AN,: (Vn,n > N;: (|| tn — 2 ||< €))) 
Examples are listed below (assuming the usual inner products): 


1.V=R” 
2V=C" 
3. V = I, (i.e., square summable sequences) 
4.V = &, (i.e., square integrable functions) 


The rational numbers provide an example of an incomplete set. We know that it is possible to 
construct a sequence of rational numbers which approximate an irrational number arbitrarily 
closely. It is easy to see that such a sequence will be Cauchy. However, the sequence will not 
converge to any rational number, and so the rationals cannot be complete. 

e We will always deal with separable Hilbert spaces, which are those that have a countable 
[footnote] orthonormal (ON) basis. A countable orthonormal basis for V is a countable 
orthonormal set S = {x;,} such that every vector in V can be represented as a linear 
combination of elements in S: 


VyyeV: (sto : (» = by 3) 
k 


Due to the orthonormality of S, the basis coefficients are given by 


We can see this via: 


n nr n 
(2k Y) = (ssn Sais) = limit (sn. Soa| = limit 2 (Le, Li)) = Ok 


where 6[k — 7] = (a, 2;) (where the second equality invokes the continuity of the inner 
product). In finite n-dimensional spaces (e.g., R” or C”), any n-element ON set constitutes an 
ON basis. In infinite-dimensional spaces, we have the following equivalences: 


1. {x9, £1, £2,...} is an ON basis 


2. If (x;, y) = 0 for all z, then y = 0 


3.¥y,y EV: ((Il wll)” =X, (\(wisy)1)”) @arseval’s theorem) 
4. Every y € V isa limit of a sequence of vectors in span ({x0, 41, £2, ...}) 


Examples of countable ON bases for various Hilbert spaces include: 


1 Leh rayen a Pioreg—= (0) sing DO DO. es 0)7 with "1" in the i*® position 
2. C”: same as IR” 

3. la: {{6;[n]}| 2 € Z}, for {6;[n]} = {6[n — 2]} (all shifts of the Kronecker sequence) 
4, Ly: to be constructed using wavelets ... 


A countable set is a set with at most a countably-infinite number of elements. Finite sets are 
countable, as are any sets whose elements can be organized into an infinite list. Continuums 
(e.g., intervals of R) are uncountably infinite. 

e Say S is a subspace of Hilbert space V. The orthogonal complement of S in V, denoted S*, is 
the subspace defined by the set {a € V| a | S}. When S is closed, we can write 
V=S@S- 

¢ The orthogonal projection of y onto S, where S is a closed subspace of V, is 


9= Yo ((wi,y))2: 


s.t. {x;} is an ON basis for S. Orthogonal projection yields the best approximation of y in S: 
gj =argmin|| y — 2 || 
es 


The approximation error e = y — ¥ obeys the orthogonality principle: 
elS 


We illustrate this concept using V = R? ([link]) but stress that the same geometrical 
interpretation applies to any Hilbert space. 


A proof of the orthogonality principle is: 
eL Sei: (4) =0) 
(y = Y, xj) = 0 


Equation: 


(y, x4) a 


Discrete Wavelet Transform: Main Concepts 


Main Concepts 


The discrete wavelet transform (DWT) is a representation of a signal 

x(t) € Y2 using an orthonormal basis consisting of a countably-infinite set 
of wavelets. Denoting the wavelet basis as {#i..n(t)|k EZ A n € Z}, the 
DWT transform pair is 

Equation: 


x(t) = S- 5 din Wkn(t) 
k 


=—c n=—C 
Equation: 


dim = (Wen(t), e(t)) 
i. Wrn(t)z(t) dt 


where {dj,,,} are the wavelet coefficients. Note the relationship to Fourier 
series and to the sampling theorem: in both cases we can perfectly describe 
a continuous-time signal x(t) using a countably-infinite (i.e., discrete) set 
of coefficients. Specifically, Fourier series enabled us to describe periodic 
signals using Fourier coefficients { X[k]| k € Z}, while the sampling 
theorem enabled us to describe bandlimited signals using signal samples 
{z[{n]|n € Z}. In both cases, signals within a limited class are represented 
using a coefficient set with a single countable index. The DWT can describe 
any signal in“ using a coefficient set parameterized by two countable 
indices: {din|kEZ A ne Z}. 


Wavelets are orthonormal functions in obtained by shifting and 


stretching a mother wavelet, w(t) € %). For example, 
Equation: 


Vk,nk A neZ: (Yen(t) = 2-39 (2-*t n)) 


defines a family of wavelets {Wi.n(t)|k © Z A n © Z} related by power- 
of-two stretches. As k increases, the wavelet stretches by a factor of two; as 
nm increases, the wavelet shifts right. 


Note: When || 7)(t) ||= 1, the normalization ensures that || Hzn(t) ||= 1 
forallk € Z,n € Z. 


Power-of-two stretching is a convenient, though somewhat arbitrary, 
choice. In our treatment of the discrete wavelet transform, however, we will 
focus on this choice. Even with power-of two stretches, there are various 
possibilities for 7)(t), each giving a different flavor of DWT. 


Wavelets are constructed so that {z,n(t)|n € Z} (ie., the set of all shifted 
wavelets at fixed scale k), describes a particular level of 'detail' in the 
signal. As k becomes smaller (i.e., closer to —oo), the wavelets become 
more "fine grained" and the level of detail increases. In this way, the DWT 
can give a multi-resolution description of a signal, very useful in analyzing 
"real-world" signals. Essentially, the DWT gives us a discrete multi- 
resolution description of a continuous-time signal in 7%. 


In the modules that follow, these DWT concepts will be developed "from 
scratch" using Hilbert space principles. To aid the development, we make 
use of the so-called scaling function y(t) € 2, which will be used to 
approximate the signal up to a particular level of detail. Like with 
wavelets, a family of scaling functions can be constructed via shifts and 
power-of-two stretches 

Equation: 


Vkn,k AneZ: (ean(t) = 2-Fy(2-*¢ — n)) 


given mother scaling function y(t). The relationships between wavelets and 
scaling functions will be elaborated upon later via theory and example. 


Note: The inner-product expression for dz, [link] is written for the 
general complex-valued case. In our treatment of the discrete wavelet 
transform, however, we will assume real-valued signals and wavelets. For 


this reason, we omit the complex conjugations in the remainder of our 
DWT discussions 


The Haar System as an Example of DWT 


The Haar basis is perhaps the simplest example of a DWT basis, and we 
will frequently refer to it in our DWT development. Keep in mind, however, 
that the Haar basis is only an example; there are many other ways of 
constructing a DWT decomposition. 


For the Haar case, the mother scaling function is defined by [link] and 
[link]. 


Equation: 
ine 1if0<t<l 
PY) = 0 otherwise 
a(t) 
0 1? 


From the mother scaling function, we define a family of shifted and 
stretched scaling functions {yzn(t) } according to [link] and [link] 
Equation: 


prn(t) = Vk,n,keZneZ: (2-Fp(2"t — n)) 


= 24o(k (tnd) 


em 


g-k/2 


lie | es 
2 Via (n+1)2* 


which are illustrated in [link] for various k and n. [link] makes clear the 
principle that incrementing n by one shifts the pulse one place to the right. 
Observe from [link] that {y;,(t)| € Z} is orthonormal for each k (i.e., 
along each row of figures). 


-1,o(t) 


2 


A Hierarchy of Detail in the Haar System 


Given a mother scaling function y(t) € “2 — the choice of which will be 
discussed later — let us construct scaling functions at "coarseness-level-k” 
and "shift-n" as follows: 


Pan(t) = 2-7 y(2-*t — n). 


Let us then use Vz to denote the subspace defined by linear combinations of 
scaling functions at the k*" level: 


Ve = span ({yen(t)|n € Z}). 


In the Haar system, for example, Vo and V, consist of signals with the 
characteristics of xg(t) and x(t) illustrated in [link]. 
a(t) 


I 
ab 1_|2 (3 |4 (5 16 7, 
=| = 


a(t) 


We will be careful to choose a scaling function y(t) which ensures that the 
following nesting property is satisfied: 


 CWCVYCVWCV1CV_oC... 
coarse detailed 


In other words, any signal in V; can be constructed as a linear combination 
of more detailed signals in V,_, . (The Haar system gives proof that at 
least one such y(t) exists.) 


The nesting property can be depicted using the set-theoretic diagram, [link], 
where V_ is represented by the contents of the largest egg (which includes 
the smaller two eggs), Vo is represented by the contents of the medium- 
sized egg (which includes the smallest egg), and V; is represented by the 
contents of the smallest egg. 


vu WN Vi 


rd 


Going further, we will assume that y(t) is designed to yield the following 
three important properties: 


1. {yY~,n(t)|n € Z} constitutes an orthonormal basis for Vj, 

2. V,. = {0} (contains no signals). [footnote] 
While at first glance it might seem that V. should contain non-zero 
constant signals (e.g., x(t) = a for a € R), the only constant signal in 
-L , the space of square-integrable signals, is the zero signal. 

3. V_o = & (contains all signals). 


Because {yxn(t)| 2 € Z} is an orthonormal basis, the best (in 2 norm) 
approximation of x(t) € -Z2 at coarseness-level-k is given by the 
orthogonal projection, [link] 

Equation: 


x ),(t) = S> Cyn Pkn(t) 


Equation: 


Chn = (Pkyn(t), 2(t)) 


We will soon derive conditions on the scaling function y(t) which ensure 
that the properties above are satisfied. 


Haar Approximation at the kth Coarseness Level 


It is instructive to consider the approximation of signal z(t) € 2 at 
coarseness-level-k of the Haar system. For the Haar case, projection of 
x(t) € YL onto V;, is accomplished using the basis coefficients 
Equation: 


Chn = Joo Phn(t)a(t) dt 
= fei" o-Fa(t) dt 


n2k 


giving the approximation 
Equation: 


te(t) = Yoo CknPk,n(t) 


n=—oo JngQk 


nr k pL 
Te Fe IG a(t) dt (27 ynn(t)) 


where 


1 (n+1)2* 


oe x(t) dt = average value of x(t) in interval 
n2k 


Vet (22 Prn(t) = height = 1) 
This corresponds to taking the average value of the signal in each interval 


of width 2* and approximating the function by a constant over that interval 
(see [link]). 


The Scaling Equation 


Consider the level-1 subspace and its orthonormal basis: 
Equation: 


V pnt n 


Equation: 


Oe A == - SE ht 


SinceV V (ie, V ismoredetailedthanV )andsincey t V~, 


there must exist coefficients hn n such that 
Equation: 
p t hnyp nt 

Equation: 

—yp —t hnpt n 
Equation: 

Scaling Equation 
pt 7 hnyp t n 


To be a valid scaling function, y ¢ must obey the scaling equation for 
some coefficient set hn 


The Wavelet Scaling Equation 


The difference in detail between Vz and V,_; will be described using Wx , 
the orthogonal complement of Vz in Vp_1: 
Equation: 


Ve_-1 = Ve ® We 


At times it will be convenient to write W; = Vz. This concept is 


illustrated in the set-theoretic diagram, [link]. 
Ke 


Suppose now that, for each k € Z, we construct an orthonormal basis for 
W;, and denote it by {Wzn(t)|n € Z}. It turns out that, because every V;, 
has a basis constructed from shifts and stretches of a mother scaling 


function (i.e., Yen(t) = 2-7 (2-*t - n), every W;, has a basis that can 
be constructed from shifts and stretches of a "mother wavelet" a(t) € LZ: 


bin (t) = 2-7 (2-*t — n). 
The Haar system will soon provide us with a concrete example . 


Let's focus, for the moment, on the specific case k = 1. Since W, C Vo, 
there must exist {g[n|| € Z} such that: 
Equation: 


~10(t) = 3 gn] Pon() 


n=—OCoO 


= 0(5t) = Do alnlete—m 


n=— Co 


Equation: 
Wavelet Scaling Equation 


w(t) = V2 S> gin]y(2t —n) 


n=—CO 


To be a valid scaling-function/wavelet pair, y(t) and w(t) must obey the 
wavelet scaling equation for some coefficient set {g{n]}. 


Conditions on h[n] and g[n] 


Here we derive sufficient conditions on the coefficients used in the scaling equation and wavelet 
scaling equation that ensure, for every k € Z, that the sets {yz,n(t)|n € Z} and {Wen(t)|n € Z} 
have the orthonormality properties described in The Scaling Equation and The Wavelet Scaling 
Equation. 


For {Yxn(t)| € Z} to be orthonormal at all k, we certainly need orthonormality when & = 1. This 
is equivalent to 
Equation: 


dlm] = (¥i0(t), P1m(t)) 
( 


where 6[n — £+ 2m] = (p(t — n), p(t — 2 — 2m)) 
Equation: 


There is an interesting frequency-domain interpretation of the previous condition. If we define 
Equation: 


h[m]*h[—m] 
Vin h[n]h[n — m] 


p|m| 


then we see that our condition is equivalent to p[2m] = 6{m]. In the z-domain, this yields the pair of 
conditions 
Equation: 

Power-Symmetry Property 


PaASjAZAle*) 


l= 1/2) > P(2!e"#*) _ 1/2P(2') 4 1/2P(-2") 


p=0 


Putting these together, 
Equation: 


2 2= (\H(e)|)° + (JH(e*) ) 


where the last property invokes the fact that h{n] € IR and that real-valued impulse responses yield 
conjugate-symmetric DTFTs. Thus we find that h[n] are the impulse response coefficients of a 
power-symmetric filter. Recall that this property was also shared by the analysis filters in an 
orthogonal perfect-reconstruction FIR filterbank. 


Given orthonormality at level k = 0, we have now derived a condition on h/n] which is necessary 
and sufficient for orthonormality at level k = 1. Yet the same condition is necessary and sufficient for 
orthonormality at level k = 2: 

Equation: 


6[m] » P2m(t)) 

a nlr|er a(t), die hl ere+om(t)) 
den Pin] d7e PIE] ((Pin(t), P1,e+2m(t))) 
yin =—oo h{n]h[n — 2m] 


—oCo 


I 


Pte TE oii 
aS) 
N 
i=) 
“~~ 
of 
YY 
6 


I 


where 6[n — £+ 2m] = (yin(t), P1,2+2m(t)). Using induction, we conclude that the previous 
condition will be necessary and sufficient for orthonormality of {yxn(t)|n € Z} for all k € Z. 


To find conditions on {g|n]} ensuring that the set {¢,,,(t)|n € Z} is orthonormal at every k, we can 
repeat the steps above but with g[n] replacing h[n], #z,n(t) replacing yz, (t), and the wavelet- 
scaling equation replacing the scaling equation. This yields 

Equation: 


2 = G(2)G(z) + G(-2)G(-z) 


Next derive a condition which guarantees that W;, Vz, as required by our definition W, = Vey for 
all k € Z. Note that, for any k € Z, Wy, 1 Vz is guaranteed by 

{rn(t)|n € Z} L {prn(t)|n € Z} which is equivalent to 

Equation: 


0 = (He+1,0(t), Pr+1m(t)) 

(din lr] Prn(t), doe Pll pe,e+2m(t)) 
din IM] dig MIE] ((Prn(t)s Pa,e+2m(t))) 
yin II 


ng|n|h[n — 2m 


I 


for all m where 6[n — £+ 2m] = (Yen(t), Pk,e+2m(t)). In other words, a 2-downsampled version of 
g|n|*h|—n] must consist only of zeros. This necessary and sufficient condition can be restated in the 
frequency domain as 

Equation: 


0=1/2 3 G(z'%e-W9)) (2120) 


p=0 


The choice 
Equation: 


satisfies our condition, since 


G(z)H(z"*) + G(-z)H(-z") =+ (--?H((-2)") A(z) =e zPa(z")H((-2)*) =0 


In the time domain, the condition on G(z) and H(z) can be expressed 
Equation: 


Vodd P : (gjn] = +(—1"A[P — n))). 


Recall that this property was satisfied by the analysis filters in an orthogonal perfect reconstruction 
FIR filterbank. 


Note that the two conditions 
Vodd P : (G(2) - +(2?H((-2)"))) 


2 = A(z)H(z") + H(-2z)A(-z"") 


are sufficient to ensure that both {y;,,(t)|n € Z} and {%;,,(t)| € Z} are orthonormal for all & 
and that W; | V;, for all k, since they satisfy the condition 2 = G(z)G(z) + G(-z)G(-z"") 
automatically. 


Values of g[n] and h[n] for the Haar System 


The coefficients {h[n]} were originally introduced at describe y1(t) in 
terms of the basis for Vo: 


gio(t) = S— h[n|yon(t) 


From the previous equation we find that 
Equation: 


(Yo,m(t), P1,0(t)) (Yo,m(t); don Bln] Po,n(t)) 
= dVinhln] ((Yom(t), Yon(t))) 


h{m| 


where d[n — m] = (Yo,m(t), Yo,n(t)), which gives a way to calculate the 
coefficients {h[m]} when we know 94,n(t). 


In the Haar case 
Equation: 


h{m| = f° Vom(t)yro(t) dt 
jer ~10(t) d t 
‘3 if m € {0,1} 


0 otherwise 


since Y1,0(t) = ae in the interval |0, 2) and zero otherwise. Then choosing 
P = 1ing[n| = —1"h(P — n), we find that 
a if O 
ak Soe 
g|n mie 
0 otherwise 


for the Haar system. From the wavelet scaling equation 
W(t) = V2 }° g[n]p(2t — n) = p(2t) — y(2t - 1) 


we can see that the Haar mother wavelet and scaling function look like in 
[link]: 
4 o(t) ¥(t) 


It is now easy to see, in the Haar case, how integer shifts of the mother 
wavelet describe the differences between signals in V_; and Vo ([link]): 


a(t) Ee V_, 


wo 
nin 


We expect this because V_1 = Vo 6 Wo. 


Wavelets: A Countable Orthonormal Basis for the Space of Square- 
Integrable Functions 


Recall that Vi = Wii ® Vesi and that Vir, = Weyo ® Vero. Putting 
these together and extending the idea yields 
Equation: 


Ve = West 8 Wryo 8 Vero 
Wr 8 Weie2 ®... BWe @ Ve 
= Wrist 8 Wrire 8 Wiig... 


2 ay We 
ae, 


If we take the limit as k — —ox, we find that 


Equation: 
Zo. = Vig 
= © (Wi) 
Moreover, 
Equation: 
(Wy, LV) A (Wese C Vi) > (Wy L Wee) 
Equation: 


(W2 ae V2) /\ (Wrsg C V2) —- (W2 als Wrss3) 


from which it follows that 
Equation: 


Wr L Wye 


or, in other words, all subspaces W; are orthogonal to one another. Since 
the functions {zn (t)| € Z} form an orthonormal basis for Wx , the 
results above imply that 

Equation: 


{kn(t)|n A k € Z}constitutes an orthonormal basis forLy 


This implies that, for any f(t) € 2, we can write 
Equation: 


= 3 dy[m] pe m(t) 
k=—oo M=— CO 


Equation: 


dj, [m] = (Ye m(t), f(t) 


This is the key idea behind the orthogonal wavelet system that we have 
been developing! 


Filterbanks Interpretation of the Discrete Wavelet Transform 


Assume that we start with a signal x(t) € . Denote the best approximation at the 0* level of coarseness by 
x(t). (Recall that z(t) is the orthogonal projection of x(t) onto Vo .) Our goal, for the moment, is to decompose 
Z(t) into scaling coefficients and wavelet coefficients at higher levels. Since zo(t) € Vo and Vo = Vi © Wi, 
there exist coefficients {cg[n]}, {ci[n]}, and {d1[n]} such that 

Equation: 


to(t) = Yan col] ¥on(é) 
Man Clr] ¢inlt] + Van Alri nlt] 


I 


Using the fact that {y,,,,(#)| m € Z} is an orthonormal basis for Vj , in conjunction with the scaling equation, 
Equation: 


ci[n| 


| 


to(t), Yin(t)) 

Da aut 

] ((Yo,m 

] (e(t — m), ye h[dle(t — £— 2n))) 

m ©O cl Lew hid] (elt — m), e(t — £ — 2n))) 
| = 


w(t); Prn(t)) 
(t), Pin(t))) 


I 


( 
omm 
= Doan 
dim 
a 


I 


where d[t — £— 2n] = (y(t — m), p(t — £ — 2n)). The previous expression ([link]) indicates that {c,[n]} results 
from convolving {co|m]} with a time-reversed version of h|m] then downsampling by factor two ({link]). 


co{m] —- G(z-*) ~ $2 —+ d,[n] 


Using the fact that {71(t)| € Z} is an orthonormal basis for Wj , in conjunction with the wavelet scaling 
equation, 
Equation: 


di[n] = (zo(t), Pin(t)) 
= (Yinm Colm] yo 
=. Dan 


sm(t); b1,n(€)) 
oy Pin(t))) 


—_~ 
aS 


co[m| 
= Limm Colm] ( m), die gléle(t — £ — 2n))) 
= Limm Colm] dee a (p(t — m), p(t — £— 2n))) 
= pares co [m]g[m - 2n] 


where é[t — £— 2n] = (y(t — m), p(t — £— 2n)). 


The previous expression ([link]) indicates that {d;|n]} results from convolving {co|m]} with a time-reversed 
version of g[m] then downsampling by factor two ([link]). 


co[m] —+| G(z) |—+ 42 _—- di [n} 


Putting these two operations together, we arrive at what looks like the analysis portion of an FIR filterbank 


([link]): 


H(z!) || [2 =e [n] 
co[m] ; 
G(z7t) #4} [2 = di[n] 


We can repeat this process at the next higher level. Since V; = W2 @ Va, there exist coefficients {c2[n]} and 
{d2[n]} such that 
Equation: 


tilt) = Yin cilr] yi n(t) 
Yin d2[n\ban(t) + Donn C2lr|P2,n(t) 


Using the same steps as before we find that 
Equation: 


coln] = S> ex [m|h[m — 2n] 
Equation: 


da{n] = > e[mlglm — 2n] 


which gives a cascaded analysis filterbank ([(link]): 


e;[m] —| H(2-*) |» {2 |» op{n] 


— H(2z-*) | | 2 = G(z-!) + {2 = da{n} 


€0 [4] ——*) G(z~*) --*) 42 ~ dy{m| 


If we use Vy = W, @ W, © Wz © --- © W;, @ V;,, to repeat this process up to the k** level, we get the iterated 
analysis filterbank ([link]). 


- of » 12 ~ dia) 
cain) eo A{e-*) em 42 
alm) ee Aiea *) |e yo ~ Gla") me 12 ~ dip) 
= Hi{s—') |» 12 ™ Gi") as 12 = dain) 


As we might expect, signal reconstruction can be accomplished using cascaded two-channel synthesis filterbanks. 
Using the same assumptions as before, we have: 
Equation: 


colm] = (xo(t), Pom(t)) 

(Man Culm] Pin (t) + Lan Alr}Pin(t); Pom(t)) 

yin 1ln] (Grn (Et), Pom(t))) + Lenn A1lr] ((b1n(t), Po,m(t))) 
Yinn Cilm|alm — 2n] + do nn dilm|glm — 2n] 


I 


nn& 


where h[m — 2n] = (Y1in(t), Yom(t)) 
and glm — 2n| = (Yin(t), Pom(é)) 


which can be implemented using the block diagram in [link]. 


j-{t2|-[ He) 
)-[12/-+| Ge) 


The same procedure can be used to derive 
Equation: 


@-+ alm 


= Si cg[njh[m — 2n] +S do[n]g[m — 2n] 


from which we get the diagram in [link]. 


cp[n]—=| $2 | H(z) 


ci{m] 


dp{n]—=|t2|-+| G(z) L@—{t2b+| He) |H 


d; {m] +{t2;-=| Giz) cold] 


To reconstruct from the kt# level, we can use the iterated synthesis filterbank ((link]). 


exim] —12—= Hie) 


} ential 
aula) T2i—s Glz) 4) 


72 - AL 
& eqinl 
dst) 12\-= Gz) —Gp-#/12 = His) 
} enim 
dyin) 12 \-e Gis) a t2)-e Ale) 
2 
ifm] = 12 Gir) —G-- ltl 


The table makes a direct comparison between wavelets and the two-channel orthogonal PR-FIR filterbanks. 


Discrete Wavelet Transform 2-Channel Orthogonal PR-FIR Filterbank 
Analysis- = 
ie H(z ) H(z) 
aus H(z)H(z"!) + H(-z)H(-2z71) =2 Ho(z)Ho(z-1) + Ho(—z)Ho(—z7 


Symmetry 


Discrete Wavelet Transform 2-Channel Orthogonal PR-FIR Filterbank 


or | Oe") (2) 

Sa VP, P is odd : (G(z) = +(z-? H(-z7))) VN, Nis even : (Hi(z) = +(z-8-) Hy (- 
ae H(z) Go(z) = 22° 8-D Hy (27) 

Sa G(z) Gi(z) = 22-N-)) A, (2) 


From the table, we see that the discrete wavelet transform that we have been developing is identical to two-channel 
orthogonal PR-FIR filterbanks in all but a couple details. 


1. Orthogonal PR-FIR filterbanks employ synthesis filters with twice the gain of the analysis filters, whereas in 
the DWT the gains are equal. 

2. Orthogonal PR-FIR filterbanks employ causal filters of length N, whereas the DWT filters are not 
constrained to be causal. 


For convenience, however, the wavelet filters H(z) and G(z) are usually chosen to be causal. For both to have 
even impulse response length N, we require that P = N — 1. 


Initialization of the Wavelet Transform 


The filterbanks developed in the module on the filterbanks interpretation of 
the DWT start with the signal representation {co{n|| € Z} and break the 
representation down into wavelet coefficients and scaling coefficients at 
lower resolutions (i.e., higher levels k). The question remains: how do we 
get the initial coefficients {cg|n] }? 


From their definition, we see that the scaling coefficients can be written 
using a convolution: 
Equation: 


coln] = (p(t —n), x(t) 
[2 o(t — n)a(t) dt 
= (—t)*2(t)| 7 


which suggests that the proper initialization of wavelet transform is 
accomplished by passing the continuous-time input x(t) through an analog 
filter with impulse response y(—t) and sampling its output at integer times 


({link]). 


z(t) oo }Z as ey [m] ae 


(H(z) Le! 42 2b ale 


Glz Ga) | = 42 42} + ata) 


Practically speaking, however, it is very difficult to build an analog filter 
with impulse response y(—t) for typical choices of scaling function. 


The most often-used approximation is to set co[n] = x|n]. The sampling 
nr) 


theorem implies that this would be exact if y(t) = , though clearly 
this is not correct for general y(t). Still, this adintne: is somewhat 
justified if we adopt the view that the principle advantage of the wavelet 
transform comes from the multi-resolution capabilities implied by an 
iterated perfect-reconstruction filterbank (with good filters). 


Regularity Conditions, Compact Support, and Daubechies' Wavelets 


Here we give a quick description of what is probably the most popular 
family of filter coefficients h|n] and g/n| — those proposed by Daubechies. 


Recall the iterated synthesis filterbank. Applying the Noble identities, we 
can move the up-samplers before the filters, as illustrated in [link]. 


lel —{t2'}-] mize) jG 


dj{q] —-{t2'| Ge") Tg H(2™) 


P= calf 


dsp] | t8 || G(e4)H(e2)H(z) LY 


dajn] —e| $4] +4 G(z?) H(z) 


d; [rn] 2 G(z} _| 


The properties of the 7-stage cascaded lowpass filter 
Equation: 


in the limit 2 — oo give an important characterization of the wavelet 
system. But how do we know that limit H™ (ec) converges to a response 
t1—0o 


in LY, ? In fact, there are some rather strict conditions on H (e’”) that must 


be satisfied for this convergence to occur. Without such convergence, we 
might have a finite-stage perfect reconstruction filterbank, but we will not 
have a countable wavelet basis for “ . Below we present some "regularity 
conditions" on H (e”) that ensure convergence of the iterated synthesis 


lowpass filter. 


Note:The convergence of the lowpass filter implies convergence of all 
other filters in the bank. 


Let us denote the impulse response of H(z) by A [n]. Writing 
H(z) = H(2?") A V(2) 
in the time domain, we have 


hin] = S— h[k]h& [n — 2°-1k] 
kk 


Now define the function 
yp) (t) = 22 = hw) [n] Fin /2 41/2) (t) 


where .¥|,,»)(t) denotes the indicator function over the interval [a, 6): 


_ {1 if t € [a,b) 
Fa,) (t) i i if: g la, b) 


The definition of y“) (t) implies 
Equation: 


nm ntl). (,@) — 9-i,,0 
vite |=, 7 BG In] = 2-Ty (t)) 


Equation: 


1 : . i-1 3 
Vt,t € “) . (a In — 21k] = 2-8 yD (a4 k)) 


and plugging the two previous expressions into the equation for Ao [n] 
yields 
Equation: 


p(t) = V2S— h[kje* [at — ky. 
kk 


Thus, if yp (t) converges pointwise to a continuous function, then it must 
satisfy the scaling equation, so that limit y(t) = y(t). Daubechies 
1-00 


showed that, for pointwise convergence of vy) (t) to a continuous function 
in LY, , it is sufficient that H (e”) can be factored as 
Equation: 


ve.p >: (a(er) = va(4*)’ ales) 


for R(e”) such that 
Equation: 


sup (R(e™)|) < 2° 


Here P denotes the number of zeros that H (eo) has at w = 7. Such 
conditions are called regularity conditions because they ensure the 
regularity, or smoothness of y(t). In fact, if we make the previous 
condition stronger: 

Equation: 


votes 1% (sup (|R(e) ) < rat) 


then limit y(t) = y(t) for y(t) that is ¢-times continuously 


100 


differentiable. 


There is an interesting and important by-product of the preceding analysis. 

If h[n] is a causal length-N filter, it can be shown that h [n] is causal with 

length N* = 2‘ (N — 1) +1. By construction, then, y [¢] will be zero 

outside the interval 0, ae et . Assuming that the regularity conditions 

are satisfied so that limit y(t) = y(t), it follows that y(t) must be zero 
too 


outside the interval [0, N — 1]. In this case we say that y(t) has compact 
support. Finally, the wavelet scaling equation implies that, when y(t) is 
compactly supported on [0, NV — 1] and g[n] is length N, ~(t) will also be 
compactly supported on the interval [0, NV — 1]. 


Daubechies constructed a family of H(z) with impulse response lengths 
N € {4,6, 8, 10,...} which satisfy the regularity conditions. Moreover, 
her filters have the maximum possible number of zeros at w = 7, and thus 
are maximally regular (i.e., they yield the smoothest possible y(t) for a 
given support interval). It turns out that these filters are the maximally flat 
filters derived by Herrmann long before filterbanks and wavelets were in 
vogue. In [link] and [link] we show y(t), (2), w(t), and ¥(2) for 
various members of the Daubechies' wavelet system. 


See Vetterli and Kovacivi¢ for a more complete discussion of these matters. 
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Computing the Scaling Function: The Cascade Algorithm 
This module shows how to compute the scaling function. It also has a 
section with a proof for an assumption made for the computation. 


Given coefficients {h[n]} that satisfy the regularity conditions, we can 
iteratively calculate samples of y(t) on a fine grid of points {¢} using the 
cascade algorithm. Once we have obtained y(t), the wavelet scaling 
equation can be used to construct w(t). 


In this discussion we assume that H(z) is causal with impulse response 
length NV. Recall, from our discussion of the regularity conditions, that this 
implies y(t) will have compact support on the interval [0, N — 1]. The 
cascade algorithm is described below. 


1. Consider the scaling function at integer times 
t=me {0,...,N—1}: 


N-1 


v(m) = V2 > h(n)p(2m — n) 


n=0 


Knowing that y(t) = 0 fort ¢ [0, N — 1], the previous equation can 
be written using an VxN matrix. In the case where N = 4, we have 


Equation: 

y(0) h{o} O 0 0 ~(0) 
el) _ fy hla] All] Alo] 0 y(1) 
(2) 0 Als} Al2} All] = (2) 
9(3) 0 0 0 Als] ¥f3) 

h{o} oO 0 0 

where . M2) RU Alo] 0 

0 Al3} Al2] All] 

0 0 0 Af3] 


The matrix A is structured as a row-decimated convolution matrix. 
From the matrix equation above ([link]), we see that 
(y~(0)y(1)y(2)y(3))* must be (some scaled version of) the 


eck 
eigenvector of H corresponding to eigenvalue (v 2) . In general, 


the nonzero values of {y(n)|n € Z}, ie., (p(0)y(1)...p(N —1))* 
, can be calculated by appropriately scaling the eigenvector of the Nx 
N row-decimated convolution matrix H corresponding to the 


eigenvalue (v2) . It can be shown that this eigenvector must be 


scaled so that 3’! y(n) = 1. 


n=0 
. Given {y(n)|n € Z}, we can use the scaling equation to determine 


e(z) ne Zy: 


Equation: 
(+) = ay h[n|y(m — n) 


This produces the 2/V — 1 non-zero samples 


{(0), y(1/2), y(1), y(3/2), | p(N _ 1)}. 
. Given {p(4) ne Z\, the scaling equation can be used to find 


te(G) ne Z}: 
Equation: 


2 
3 
| 
o 
M 
= 2 
i 
= 
3, 
6 
~|3 
| 
<2 


= V2 pp hr2 lp]p1[m —p| 


where h+2[p] denotes the impulse response of H (z”), ie., a 2- 
upsampled version of h[n], and where y1[m] = y(2). Note that 
{p(4) n € Z} is the result of convolving h+2[n] with {y 1[n] \. 


. Given {yp(+) ne Z\, another convolution yields {y(4) ne Z}: 
Equation: 


| 
wl 
wy 
Sj 
— 
Fs 
i 
6 
3 
| 
3. 


where h+,4|n] is a 4-upsampled version of h|n] and where 
g1[m] = 9(7). 
5. At the £* stage, {yp (+) \ is calculated by convolving the result of the 
é — 1™ stage with a 24~1-upsampled version of A[n]: 
Equation: 


p1(m) = V2) 7 hyo plp_a_|m — pl 


For £ ~ 10, this gives a very good approximation of y(t). At this point, 
you could verify the key properties of y(t), such as orthonormality and the 
satisfaction of the scaling equation. 


In [link] we show steps 1 through 5 of the cascade algorithm, as well as step 
10, using Daubechies' db2 coefficients (for which NV = 4). 


teramon 0 terten 
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Finite-Length Sequences and the DWT Matrix 


The wavelet transform, viewed from a filterbank perspective, consists of 
iterated 2-channel analysis stages like the one in [link]. 


+42 | cesta 


+ 12 = des ifn) 


First consider a very long (i.e., practically infinite-length) sequence 
{cz[m]|m € Z}. For every pair of input samples {c,[2n], cz,[2n — 1]} that 
enter the k'® filterbank stage, exactly one pair of output samples 

{cz41|n], dz41|n]} are generated. In other words, the number of output 
equals the number of input during a fixed time interval. This property is 
convenient from a real-time processing perspective. 


For a short sequence {cz[m]|m € {0,..., M@ — 1}}, however, linear 
convolution requires that we make an assumption about the tails of our finite- 
length sequence. One assumption could be 

Equation: 


VYm,m € {0,...,M —1}: (c,[m] = 0) 


In this case, the linear convolution implies that VW nonzero inputs yield 

a, =] outputs from each branch, for a total of 

2 (4* — 1) =M+N—2> M outputs. Here we have assumed that 
both H (z*) and G [ie have impulse response lengths of N > 2, and that 


M and WN are both even. The fact that each filterbank stage produces more 
outputs than inputs is very disadvantageous in many applications. 


A more convenient assumption regarding the tails of 

{cz[m]|m € {0,..., M — 1}} is that the data outside of the time window 
{0,..., M — 1} is a cyclic extension of data inside the time window. In other 
words, given a length-// sequence, the points outside the sequence are 
related to points inside the sequences via 

Equation: 


cz.[m] = c.[m + M] 


Recall that a linear convolution with an M/-cyclic input is equivalent to a 
circular convolution with one M/-sample period of the input sequences. 
Furthermore, the output of this circular convolution is itself M/-cyclic, 
implying our 2-downsampled branch outputs are cyclic with period 4. Thus, 
given an M-length input sequence, the total filterbank output consists of 
exactly M values. 


It is instructive to write the circular-convolution analysis fiterbank operation 
in matrix form. In [link] we give an example for filter length N = 4, 
sequence length NV = 8, and causal synthesis filters H(z) and G(z). 
Equation: 


Cx+1(0] nfo) All] Al2] Als] 0 0 0 0 \ /eg(0] 
Cr1(1] 0 0 Ald} Aft] Al2) A] 0 Oo || eit] 
Cn41(2I 0 0 0 O- Alo} Alt) Al2] Al] | | ex[2] 
ce+i[3] | — | [2] a3] 0 0 O 0  AfO] All] | | ex[3! 
di+1[0] g[0] gi1] gf2] gi} 0 O O DO |] cx[4! 
di+i[1] 0 0 gid} oft] g[2) gf8} 0 O |] cal5] 
di.+1[2] 0 0 0 0O- gd} g{1] g[2] gf3] | | cxl6] 
di.+1[3] 92} gi3} 09 0 O O- gf} gi1]/ \cxl7] 
where 
cx41[0] 
cr4i(l] 
Cr+1[2] 
Ck+1 ck +1[3] 
ic) | dyy1[0] 
di+1(1] 
die+1[2] 
dx.+1[3] 


nfo] rll] hi] Ais} 0 O O O 
0 0 Aldj Alt] Al2] Als] 0 0 
0 0 0 O Alo} Alt) Al2] AlB] 
eae nf] AB] 0 O 0 O Alo] alt 
Gu/ | gl0] g{1] 92] 93} 0 O 0 0 
0 0 gd) oft] g[2] gf3} 0 0 
0 0 O 0 gd} oft] 92] 93] 
gl2} gs} 09 OO O OO gO} gf] 

cx|0| 

cr(1| 

cr(2| 

_ | ex(3] 

| end 

cr(5| 

cx|6| 

cxl7| 


The matrices Hy and G' yz have interesting properties. For example, the 
conditions 


ae eu) h|[n — 2m] 


g[n] = -1°A[N -—1—- nl] 


Hu\’ (Hu\ (Hu Hy\" _; 
Ga) \Gus Gas Cas 
where Ij, denotes the MxM identity matrix. Thus, it makes sense to define 


the MxM DWT matrix as 
Equation: 


imply that 


Hy 
ips 
u (ar) 


whose transpose constitutes the I/x/M inverse DWT matrix: 
Equation: 


Ty = Ty 


Since the synthesis filterbank ([link]) 


ce+1[n] —+{42|}-+4 H(z) -— 


Ch} c[m] 


deii[n] — T2}—e) Giz) 


gives perfect reconstruction, and since the cascade of matrix operations 
Tu? Ty also corresponds to perfect reconstruction, we expect that the matrix 
operation T'yy7 describes the action of the synthesis filterbank. This is readily 
confirmed by writing the upsampled circular convolutions in matrix form: 
Equation: 


cx.[0] ho) 0 0 Ald glo} 0 0 gl2l\ /cx+(0) 
cx[1] hil] 0 0 Af] git] 0 0 gf] ] J curl 
cp(2| h[2| nfo] O O- gf2] go] O O || ceuf2] 
cp[3| A[3] All] O O- gf3] gil] O O | | ceri] 
crl4] | | 0 Al] Alo] 0 0 gl2] glo} 0 |} dks[0) 
cx[5] 0 rf] A] 0 0 gf] oft] 0 | | deaf] 
cx(6] 0 0 Af] Ao] 0 0 gl2} g[O]} | dess[2) 
cx[7] 0 0 Af] All] 0 0 gf3] gf] \de+s([3] 


where 


hio} 0 O Af2] glo} 0 O gf2) 
hil] 0 O As] gf} O OO gf} 
h2]) nfo] O O g[2] glo] 0 oO 
en _p rf rs] ha] 0 0 gf3} oft] 0 0 
Ge “1 0 A] Alo] 0 0 gl2] glo} 0 
0 Al] All] 0 0 gf8] oft] 0 
0 0 hAf2] Ao] O 0 gf2] gO! 
0 0 Als] Aft] O O- gf3} ofl] 


So far we have concentrated on one stage in the wavelet decomposition; a 
two-stage decomposition is illustrated in [link]. 


ey [m) i H(z!) | +! {2| + exfn) 
r H(z") | 42 |. G(z7) os 42 |= dan} 
eal)—=| Ge) +142] x dy{m| 


The two-stage analysis operation (assuming circular convolution) can be 
expressed in matrix form as 


Equation: 
Ck+2 Tx 0 Gua 
d = 
ai > £2 ee 


dr+1 
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Similarly, a three-stage analysis could be implemented via 
Equation: 


Ck+3 Tu 0 0 r P 
d M 
[=] 0 In 0 (Tu) (ck) 
dp+2 0 Iu 
0 0 Im ¢ 
Ahy1 a 


It should now be evident how to extend this procedure to 3 stages. As noted 
earlier, the corresponding synthesis operations are accomplished by 
transposing the matrix products used in the analysis. 


DWT Implementation using FFTs 


Finally, we say a few words about DWT implementation. Here we focus on 
a single DWT stage and assume circular convolution, yielding an MxM 
DWT matrix Ty . In the general case, MxM matrix multiplication requires 
M2 multiplications. The DWT matrices, however, have a circular- 
convolution structure which allows us to implement them using 
significantly less multiplies. Below we present some simple and reasonably 
efficient approaches for the implementation of Ty and Tie. 


We treat the inverse DWT first. Recall that in the lowpass synthesis branch, 
we upsample the input before circularly convolving with H(z). Denoting 
the upsampled coefficient sequence by a[n], fast circular convolution 
a|n]*h|n] can be described as follows (using Matlab notation) 


ifft( fft(a).*fft(h,length(a)) ) 


where we have assumed that length(a) = Length(h). [footnote] 
The highpass branch is handled similarly using G(z), after which the two 
branch outputs are summed. 

When implementing the multi-level transform, you must ensure that the 
data length does not become shorter than the filter length! 


Next we treat the forward DWT. Recall that in the lowpass analysis branch, 
we circularly convolve the input with (2-7) and then downsample the 
result. The fast circular convolution a[n|*h[—n] can be implemented using 


wshift('1', 
ifft(fft(a).*fft(flipud(h),length(a))), 
length(h)-1 ) 


where WShift accomplishes a circular shift of the Lf ft output that makes 
up for the unwanted delay of Length(h) -1 samples imposed by the 


f1ipud operation. The highpass branch is handled similarly but with filter 
G ) Finally, each branch is downsampled by factor two. 


We note that the proposed approach is not totally efficient because 
downsampling is performed after circular convolution (and upsampling 
before circular convolution). Still, we have outlined this approach because 
it is easy to understand and still results in major saving when M is large: it 
converts the O(M matrix multiply into an O(M log, M) operation. 


DWT Applications - Choice of phi(t) 


Transforms are signal processing tools that are used to give a clear view of 
essential signal characteristics. Fourier transforms are ideal for infinite- 
duration signals that contain a relatively small number of sinusoids: one can 
completely describe the signal using only a few coefficients. Fourier 
transforms, however, are not well-suited to signals of a non-sinusoidal 
nature (as discussed earlier in the context of time-frequency analysis). The 
multi-resolution DWT is a more general transform that is well-suited to a 
larger class of signals. For the DWT to give an efficient description of the 
signal, however, we must choose a wavelet 7(t) from which the signal can 
be constructed (to a good approximation) using only a few stretched and 
shifted copies. 


We illustrate this concept in [link] using two examples. On the left, we 
analyze a step-like waveform, while on the right we analyze a chirp-like 
waveform. In both cases, we try DWTs based on the Haar and Daubechies 
db10 wavelets and plot the log magnitudes of the transform coefficients 
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Observe that the Haar DWT yields an extremely efficient representation of 
the step-waveform: only a few of the transform coefficients are nonzero. 


The db10 DWT does not give an efficient representation: many 
coefficients are sizable. This makes sense because the Haar scaling function 
is well matched to the step-like nature of the time-domain signal. In 
contrast, the Haar DWT does not give an efficient representation of the 
chirp-like waveform, while the dd10 DWT does better. This makes sense 
because the sharp edges of the Haar scaling function do not match the 
smooth chirp signal, while the smoothness of the db10 wavelet yields a 
better match. 


DWT Application - De-noising 


Say that the DWT for a particular choice of wavelet yields an efficient 
representation of a particular signal class. In other words, signals in the 
class are well-described using a few large transform coefficients. 


Now consider unstructured noise, which cannot be eifficiently represented 
by any transform, including the DWT. Due to the orthogonality of the 
DWT, such noise sequences make, on average, equal contributions to all 
transform coefficients. Any given noise sequence is expected to yield many 
small-valued transform coefficients. 


Together, these two ideas suggest a means of de-noising a signal. Say that 
we perform a DWT on a signal from our well-matched signal class that has 
been corrupted by additive noise. We expect that large transform 
coefficients are composed mostly of signal content, while small transform 
coefficients should be composed mostly of noise content. Hence, throwing 
away the transform coefficients whose magnitude is less than some small 
threshold should improve the signal-to-noise ratio. The de-noising 
procedure is illustrated in [link]. 


noisy signal —+{ pwr | +lthreshoid threshold _IDWT | i—= de-noised signal 


Now we give an example of denoising a step-like waveform using the Haar 
DWT. In [link], the top right subplot shows the noisy signal and the top left 
shows it DWT coefficients. Note the presence of a few large DWT 
coefficients, expected to contain mostly signal components, as well as the 
presence of many small-valued coefficients, expected to contain noise. (The 
bottom left subplot shows the DWT for the original signal before any noise 
was added, which confirms that all signal energy is contained within a few 
large coefficients.) If we throw away all DWT coefficients whose 
magnitude is less than 0.1, we are left with only the large coefficients 
(shown in the middle left plot) which correspond to the de-noised time- 
domain signal shown in the middle right plot. The difference between the 
de-noised signal and the original noiseless signal is shown in the bottom 
right. Non-zero error results from noise contributions to the large 
coefficients; there is no way of distinguishing these noise components from 
signal components. 
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